(54) Title: THE C. ELEGANS GRO-1 GENE
(57) Abstract 

The invention relates to the identification of the gro-1 gene
and to the demonstration that the gro-1 gene is involved in the
control of a central physiological clock. Also disclosed are
four other genes located within the same operon as the gro-1
gene.

THE C. ELEGANS GRO-1 GENE

BACKGROUND OF THE INVENTION 

(a) Field of the Invention 

The invention relates to the identification of the gro-1 gene
and of four other genes located within the same operon, and to
the demonstration that the gro-1 gene is involved in the control
of a central physiological clock.

(b) Description of Prior Art 

The gro-1 gene was originally defined by a spontaneous mutation
isolated from a Caenorhabditis elegans strain that had recently
been established from a wild isolate (J. Hodgkin and T. Doniach,
Genetics 146: 149-164 (1997)). We have shown that the activity
of the gro-1 gene controls how fast the worms live and how soon
they die. The time taken to progress through embryonic and
post-embryonic development, as well as the life span, is
increased in gro-1 mutants (Lakowski and Hekimi, Science 272:
1010-1013 (1996)). Furthermore, these defects are maternally
rescuable: when homozygous mutants (gro-1/gro-1) derive from a
heterozygous mother (gro-1/+), these animals appear to be
phenotypically wild-type. The defects are seen only when
homozygous mutants derive from a homozygous mother (Lakowski
and Hekimi, Science 272: 1010-1013 (1996)).

In general, the properties of the gro-1 gene are similar to
those of three other genes, clk-1, clk-2 and clk-3 (Wong et al.,
Genetics 139: 1247-1259 (1995); Hekimi et al., Genetics 141:
1351-1367 (1995); Lakowski and Hekimi, Science 272: 1010-1013
(1996)), and this combination of phenotypes has been called the
Clk ("clock") phenotype. All four of these genes interact to
determine developmental rate and longevity in the nematode.
Detailed examination of the clk-1 mutant phenotype has led to
the suggestion that there exists a central physiological clock
which coordinates all or many aspects of cellular physiology,
from cell division and growth to aging. All four genes have a
similar phenotype and thus appear to impinge on this
physiological clock.
It would be highly desirable to be provided with the molecular
identity of the gro-1 gene.

SUMMARY OF THE INVENTION 

One aim of the present invention is to provide the molecular
identity of the gro-1 gene and four other genes located within
the same operon.

In accordance with the present invention there is provided a
gro-1 gene which has a function at the level of cellular
physiology involved in developmental rate and longevity, wherein
gro-1 is located within an operon and gro-1 mutants have a
longer life and an altered cellular metabolism relative to the
wild-type.

In accordance with a preferred embodiment, the gro-1 gene of the
present invention codes for a GRO-1 protein having the amino
acid sequence set forth in Figs. 3A-3B (SEQ ID. NO:2).

The gro-1 gene is located within an operon which has the
nucleotide sequence set forth in SEQ ID NO:1 and which also
contains four other genes, referred to as the gop-1, gop-2,
gop-3 and hap-1 genes.

In accordance with a preferred embodiment, the 
gop-1 gene of the present invention codes for a GOP-1 
protein having the amino acid sequence set forth in 
Figs. 13A-13C (SEQ ID. NO:4). 
In accordance with a preferred embodiment, the gop-2 gene of the
present invention codes for a GOP-2 protein having the amino
acid sequence set forth in Fig. 14 (SEQ ID. NO:5).

In accordance with a preferred embodiment, the gop-3 gene of the
present invention codes for a GOP-3 protein having the amino
acid sequence set forth in Figs. 15A-15B (SEQ ID. NO:6).

In accordance with a preferred embodiment, the hap-1 gene of the
present invention codes for a HAP-1 protein having the amino
acid sequence set forth in Fig. 16 (SEQ ID. NO:7).

In accordance with a preferred embodiment of the present
invention, the gro-1 gene is of human origin and has the
nucleotide sequence set forth in Fig. 8 (SEQ ID. NO:3).

In accordance with a preferred embodiment of the 
present invention, there is provided a mutant GRO-1 
protein which has the amino acid sequence set forth in 
Fig. 3C. 

In accordance with the present invention there is also provided
a GRO-1 protein which has a function at the level of cellular
physiology involved in developmental rate and longevity, wherein
said GRO-1 protein is encoded by the gro-1 gene identified
above.

In accordance with a preferred embodiment of the present
invention, there is provided a GRO-1 protein which has the amino
acid sequence set forth in Figs. 3A-3B (SEQ ID. NO:2).

In accordance with a preferred embodiment of the present
invention, there is provided a GOP-1 protein which has the amino
acid sequence set forth in Figs. 13A-13C (SEQ ID. NO:4).

In accordance with a preferred embodiment of the present
invention, there is provided a GOP-2 protein which has the amino
acid sequence set forth in Fig. 14 (SEQ ID. NO:5).

In accordance with a preferred embodiment of the present
invention, there is provided a GOP-3 protein which has the amino
acid sequence set forth in Figs. 15A-15B (SEQ ID. NO:6).

In accordance with a preferred embodiment of the present
invention, there is provided a HAP-1 protein which has the amino
acid sequence set forth in Fig. 16 (SEQ ID. NO:7).

In accordance with the present invention there is also provided
a method for the diagnosis and/or prognosis of cancer in a
patient, which comprises the steps of:

a) obtaining a tissue sample from said patient;

b) analyzing DNA of the obtained tissue sample of step a) to
determine if the human gro-1 gene is altered, wherein alteration
of the human gro-1 gene is indicative of cancer.

In accordance with the present invention there is also provided
a mouse model of aging and cancer, which comprises a gene
knock-out of the murine gene homologous to gro-1.

In accordance with the present invention there is provided the
use of compounds interfering with the enzymatic activity of
GRO-1, GOP-1, GOP-2, GOP-3 or HAP-1 for enhancing longevity of a
host.

In accordance with the present invention there is provided the
use of compounds interfering with the enzymatic activity of
GRO-1, GOP-1, GOP-2, GOP-3 or HAP-1 for inhibiting tumorous
growth.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1A illustrates the genetic mapping of gro-1;

Fig. 1B illustrates the physical map of the gro-1 region;

Fig. 2A illustrates cosmid clones able to rescue the
gro-1(e2400) mutant phenotype;

Fig. 2B illustrates the genes predicted by Genefinder, the
relevant restriction sites and the fragments used to subclone
the region;

Figs. 3A-3B illustrate the genomic sequence and translation of
the C. elegans gro-1 gene (SEQ. ID. NO:2);

Fig. 3C illustrates the predicted mutant protein;

Fig. 4A illustrates the five genes of the gro-1 operon (SEQ. ID.
NO:1);

Fig. 4B illustrates the trans-splicing pattern of the five genes
of the gro-1 operon;

Fig. 5 illustrates the alignment of gro-1 with the published
sequences of the E. coli (P16384) and yeast (P07884) enzymes;

Fig. 6 illustrates the biosynthetic step catalyzed by DMAPP
transferase (MiaAp in E. coli, Mod5p in S. cerevisiae, and GRO-1
in C. elegans);

Fig. 7 illustrates the alignment of the predicted HAP-1 amino
acid sequence with homologues from other species;

Fig. 8 illustrates the full mRNA sequence of the human homologue
of gro-1, referred to as hgro-1 (SEQ. ID. NO:3);

Fig. 9 illustrates a comparison of the conceptual amino acid
sequences for GRO-1 and hgro-1p;

Fig. 10 illustrates a conceptual translation of a partial
sequence of the Drosophila homologue of gro-1 (AA816785);

Fig. 11 illustrates the structure of pMQ8;

Fig. 12 illustrates the construction of pMQ18;

Figs. 13A-13C illustrate the genomic sequence and translation of
the gop-1 gene (SEQ. ID. NO:4);

Fig. 14 illustrates the genomic sequence and translation of the
gop-2 gene (SEQ. ID. NO:5);

Figs. 15A-15B illustrate the genomic sequence and translation of
the gop-3 gene (SEQ. ID. NO:6); and

Fig. 16 illustrates the genomic sequence and translation of the
hap-1 gene (SEQ. ID. NO:7).

DETAILED DESCRIPTION OF THE INVENTION 

The gro-1 phenotype 

In addition to the previously documented phenotypes, we recently
found that gro-1 mutants are temperature-sensitive for
fertility. At 25°C the number of progeny produced by these
mutants is so reduced that a viable strain cannot be propagated.
In contrast, gro-1 strains can easily be propagated at 15 and
20°C.

We also discovered that the gro-1(e2400) mutation increases the
incidence of spontaneous mutations. As gro-1(e2400) was
originally identified in a non-standard background (Hodgkin and
Doniach, Genetics 146: 149-164 (1997)), we first backcrossed the
mutation 8 times against N2, the standard wild type strain. We
then undertook to examine the gro-1 strain and N2 for the
occurrence of spontaneous mutants which could be identified
visually. We focused on the two classes of mutants which are
detected most easily by simple visual inspection, uncoordinated
mutants (Unc) and dumpy mutants (Dpy). We examined 8200 wild
type worms and found no spontaneous visible mutant. By contrast,
we found 6 spontaneous mutants among 12500 gro-1 mutants
examined. All mutants produced entirely mutant progeny,
indicating that they were homozygous.
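
The counts above (0 visible mutants among 8200 wild type worms, 6 among 12500 gro-1 worms) can be turned into per-animal frequencies and compared with a one-sided Fisher exact test. The sketch below is only an illustration: the specification reports the raw counts and no statistical test, and the function names are hypothetical.

```python
from math import comb


def mutant_frequency(n_mutants: int, n_examined: int) -> float:
    """Fraction of examined animals that were visible mutants."""
    return n_mutants / n_examined


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the chance of
    seeing at least `a` mutants in the first group if mutants were
    distributed between the two groups at random.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, row1)


# Counts taken from the text: gro-1 (6 of 12500) versus N2 (0 of 8200).
gro1_freq = mutant_frequency(6, 12500)
n2_freq = mutant_frequency(0, 8200)
p_value = fisher_one_sided(6, 12500 - 6, 0, 8200)

print(f"gro-1 frequency: {gro1_freq:.2e}")
print(f"N2 frequency:    {n2_freq:.2e}")
print(f"one-sided Fisher p-value: {p_value:.3f}")
```

With so few events the test has little power, so the comparison is best read as supporting evidence rather than a precise estimate of the mutation rate.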
Sequences of all primers used

Name         Orientation  Sequence (5'-3')                               SEQ ID NO
SHP91        forward      CGAACACTTTATATTTCTCG                           SEQ. ID. NO:8
SHP92        reverse      GATAGTTCCCTTCGTTCGGG                           SEQ. ID. NO:9
SHP93        forward      TTTCTGGATTTTAACCTTCC                           SEQ. ID. NO:10
SHP94        forward      TTTCCGAGAAGTCACGTTGG                           SEQ. ID. NO:11
SHP95        reverse      TACAGGAATTTTTGAACGGG                           SEQ. ID. NO:12
SHP96        forward      CTTCAGATGACGTGGATTCC                           SEQ. ID. NO:13
SHP97        forward      [illegible]                                    SEQ. ID. NO:14
SHP98        forward      [illegible]                                    SEQ. ID. NO:15
SHP99        reverse      ATCGATACCACCGTCTCTGG                           SEQ. ID. NO:16
SHP109       reverse      [illegible]                                    SEQ. ID. NO:17
SHP100       reverse      [illegible]                                    SEQ. ID. NO:18
[illegible]  forward      [illegible]                                    SEQ. ID. NO:19
[illegible]  forward      [illegible]                                    SEQ. ID. NO:20
SHP119       reverse      ACATCTTTATCCATTTCTCC                           SEQ. ID. NO:21
[illegible]  forward      [illegible]                                    SEQ. ID. NO:22
[illegible]  reverse      [illegible]                                    SEQ. ID. NO:23
SHP130       reverse      CATCCAAAAGCAGTATCACC                           SEQ. ID. NO:24
SHP134       forward      TTAATTGGATGCAAGCACCCC                          SEQ. ID. NO:25
SHP135       reverse      ATTACTATACGAACATTTCC                           SEQ. ID. NO:26
SHP138       forward      TTGTAAAGGCGTTAGTTTGG                           SEQ. ID. NO:27
SHP139       forward      CAGGAGTATTTGGTGATGCG                           SEQ. ID. NO:28
SHP140       forward      CGACGGGGAGAAGGTGACGG                           SEQ. ID. NO:29
SHP141       reverse      AAAACTTCTACCAACAATGG                           SEQ. ID. NO:30
SHP142       reverse      CGTAATCTCTCTCGATTAGC                           SEQ. ID. NO:31
SHP143       reverse      CCGTGGGATGGCTACTTGCC                           SEQ. ID. NO:32
SHP144       reverse      TGGATTTGTGGCACGAGCGG                           SEQ. ID. NO:33
SHP145       reverse      TTGATTGCCTCTCCTCGTCC                           SEQ. ID. NO:34
SHP146       reverse      ATCAACATCTGATTGATTCC                           SEQ. ID. NO:35
SHP151       forward      CAGCGAGCGCATGCAACTATATATTGAGCAGG               SEQ. ID. NO:36
SHP159       forward      AATAAATATTTAAATATTCAGATATACCCTGAACTCTACAG      SEQ. ID. NO:37
SHP160       reverse      AAACTGTAGAGTTCAGGGTATATCTGAATATTTAAATATTTATTC  SEQ. ID. NO:38
SHP161       forward      GTACGTGGAGCTCTGCAACTATATATTGAGCAGG             SEQ. ID. NO:39
SHP162       reverse      ATGACACTGCAGGATAGTTCCCTTCGTTCGGG               SEQ. ID. NO:40
SHP163       forward      GTGTTGCATCAGTTCATTCC                           SEQ. ID. NO:41
SHP164       forward      GCTGTGCTAGAAGTCAGAGG                           SEQ. ID. NO:42
SHP165       reverse      GTTCTCCTTGGAATTCATCC                           SEQ. ID. NO:43
SHP170       reverse      AGTATATCTAGATGTGCGAGTCTCTGCCAATT               SEQ. ID. NO:44
SHP171       reverse      AGTAATTGTACATTTAGTGG                           SEQ. ID. NO:45
SHP172       forward      ATTAACCTTACTTACTTACC                           SEQ. ID. NO:46
SHP173       forward      CTAAACTAAGTAATATAACC                           SEQ. ID. NO:47
SHP174       reverse      GTTGATTCTTTGAGCACTGG                           SEQ. ID. NO:48
SHP175       forward      AATTCGACCAATTACATTGG                           SEQ. ID. NO:49
SHP176       reverse      AACATAGTTGTTGAGGAAGG                           SEQ. ID. NO:50
SHP177       forward      AATTAATGGAGATTCTACGG                           SEQ. ID. NO:51
SHP178       forward      TCAGCATCTAGAAATGCAGG                           SEQ. ID. NO:52
SHP179       reverse      CGAATGTCAACATTCACTGG                           SEQ. ID. NO:53
SHP180       forward      CTTAACCTGATGTGTACTCG                           SEQ. ID. NO:54
SHP181       forward      ATGAAGCTTTAGAGGATGCC                           SEQ. ID. NO:55
SHP182       forward      CGACGAATTTCTGGAGTCGG                           SEQ. ID. NO:56
SHP183       reverse      ACTGCATTATCCATTAATCC                           SEQ. ID. NO:57
SHP184       reverse      CACCCAAATAACATCTATCC                           SEQ. ID. NO:58
SHP185       forward      TTTAACCTCATCTTCGCTGG                           SEQ. ID. NO:59
SHP190       forward      ATGTTCCGCAAGCTTGGTTC                           SEQ. ID. NO:60
SL1          forward      TTTAATTACCCAAGTTTGAG                           SEQ. ID. NO:61
SL2          forward      TTTTAACCCAGTTACTCAAG                           SEQ. ID. NO:62


Positional cloning of gro-1 

gro-1 lies on linkage group III, very close to the gene clk-1.
To order gro-1 with respect to clk-1 on the genetic map, 54
recombinants in the dpy-17 to lon-1 interval were selected from
among the self progeny of a strain which was unc-79(e1030) + +
clk-1(e2519) lon-1(e678) + / + dpy-17(e164) gro-1(e2400) +
sma-4(e729). Three of these showed neither the Gro-1 nor the
Clk-1 phenotypes, but carried unc-79 and sma-4, indicating that
these recombination events had occurred between gro-1 and clk-1.
From the disposition of the markers, this showed that the gene
order was dpy-17 gro-1 clk-1 lon-1, and the frequency of events
indicated that the gro-1 to clk-1 distance was 0.03 map units.
In this region of the genome, this corresponds to a physical map
distance of ~20 kb.
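
The mapping arithmetic can be sketched as follows. In a standard three-factor cross, the share of recombinants that falls in a sub-interval is taken as proportional to that sub-interval's share of the selected interval. The specification does not state the dpy-17 to lon-1 distance, so the sketch back-calculates the value implied by the reported 0.03 map units. That implied interval and the kb-per-map-unit figure are derived here, not reported in the source, and the function names are hypothetical.

```python
def subinterval_map_distance(n_sub: int, n_total: int, interval_mu: float) -> float:
    """Map distance of a sub-interval, given the fraction of recombinants in it."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return interval_mu * n_sub / n_total


def implied_interval_mu(n_sub: int, n_total: int, sub_mu: float) -> float:
    """Interval size implied by a reported sub-interval distance."""
    if n_sub <= 0:
        raise ValueError("n_sub must be positive")
    return sub_mu * n_total / n_sub


# Values from the text: 3 of 54 recombinants fell between gro-1 and clk-1,
# and the gro-1 to clk-1 distance was reported as 0.03 map units (~20 kb).
interval_mu = implied_interval_mu(3, 54, 0.03)   # implied dpy-17 to lon-1 size
kb_per_mu = 20 / 0.03                            # local physical/genetic ratio

print(f"implied dpy-17 to lon-1 interval: {interval_mu:.2f} m.u.")
print(f"local ratio: {kb_per_mu:.0f} kb per map unit")
```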

Several cosmids containing wild-type DNA spanning this region of
the genome were tested by microinjection into gro-1 mutants for
their ability to complement the gro-1(e2400) mutation (Fig. 1).
gro-1 was mapped between dpy-17 and lon-1 on the third
chromosome, 0.03 m.u. to the left of clk-1 (Fig. 1A).

Based on the above genetic mapping, gro-1 was estimated to be
approximately 20 kb to the left of clk-1. Eight cosmids
(represented by medium bold lines) were selected as candidates
for transformation rescue (Fig. 1B). Those which were capable of
rescuing the gro-1(e2400) mutant phenotype are represented as
heavy bold lines (Fig. 1B).

Of these, only B0498, C34E10 and ZC395 were able to rescue the
mutant phenotype. Transgenic animals were fully rescued for
developmental speed. In addition, the transgenic DNA was able to
recapitulate the maternal rescue seen with the wild-type gene,
that is, mutants not carrying the transgenic DNA but derived
from transgenic mothers display a wild type phenotype. The 7 kb
region common to the three rescuing cosmids had been completely
sequenced, and this sequence was publicly available.

We generated subclones of ZC395 and assayed them for rescue
(Fig. 2). The common 6.5 kb region is enlarged in part B. B0498
has not been sequenced; its ends therefore cannot be positioned
and are represented by arrows.

One subclone, pMQ2, spanned 3.9 kb and was also able to
completely rescue the growth rate defect and recapitulate the
maternal effect. The sequences in pMQ2 potentially encode two
genes. However, a second subclone, pMQ3, which contained only
the first of the potential genes (named ZC395.7 in Fig. 2A), was
unable to rescue.

Furthermore, frameshifts which would disrupt each of the two
genes' coding sequences were constructed in pMQ2 and tested for
rescue. Disruption of the first gene (in pMQ4) did not eliminate
rescuing ability, but disruption of the second gene (in pMQ5)
did. This indicates that the gro-1 rescuing activity is provided
by the second predicted gene.
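
The reasoning behind this conclusion can be written as a small consistency check: a candidate gene is retained only if every construct rescues exactly when that gene is intact. This is a sketch under the assumption that a single gene is both necessary and sufficient for rescue. The data structure and function name are hypothetical; the construct contents are those described in the text.

```python
# Which predicted genes are intact in each construct, and whether the
# construct rescued the gro-1(e2400) phenotype (as reported in the text).
CONSTRUCTS = {
    "pMQ2": {"intact": {"ZC395.7", "ZC395.6"}, "rescues": True},
    "pMQ3": {"intact": {"ZC395.7"}, "rescues": False},
    "pMQ4": {"intact": {"ZC395.6"}, "rescues": True},   # frameshift in ZC395.7
    "pMQ5": {"intact": {"ZC395.7"}, "rescues": False},  # frameshift in ZC395.6
}


def candidate_genes(constructs: dict) -> set:
    """Genes whose presence matches the rescue outcome in every construct."""
    all_genes = set().union(*(c["intact"] for c in constructs.values()))
    return {
        gene
        for gene in all_genes
        if all((gene in c["intact"]) == c["rescues"] for c in constructs.values())
    }


print(candidate_genes(CONSTRUCTS))
```

Under that assumption, only ZC395.6 (the second predicted gene) is consistent with all four results.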

pMQ2 was generated by deleting a 29.9 kb SpeI fragment from
ZC395, leaving the left-most 3.9 kb region containing the
predicted genes ZC395.7 and ZC395.6 (Fig. 2B). pMQ3 was created
in the same fashion, by deleting a 31.4 kb NdeI fragment from
ZC395, leaving only ZC395.7 intact. In pMQ4, a frameshift was
induced in ZC395.7 by degrading the 4 bp overhang of the ApaI
site. A frameshift was also induced in pMQ5 by filling in the
2 bp overhang of the NdeI site found in the second exon of
ZC395.6. These frameshifts presumably abolish any function of
ZC395.7 and ZC395.6 respectively. The dotted lines represent the
extent of frameshift that resulted from these alterations.

To establish the splicing pattern of this gene, cDNAs
encompassing the 5' and 3' halves of the gene were produced by
reverse transcription-PCR and sequenced (Fig. 3).

This revealed that the gene is composed of 9 exons, spans ~2 kb,
and produces an mRNA of 1.3 kb. To confirm that this is indeed
the gro-1 gene, genomic DNA was amplified by PCR from a strain
containing the gro-1(e2400) mutation and the amplified product
was sequenced. A lesion was found in the 5th exon, where a
9 base-pair sequence has been replaced by a 2 base-pair
insertion, leading to a frameshift (Fig. 3C). In Fig. 3C, the
residues which differ from wild type are in bold.

The reading frame continues out-of-frame for 
another 33 residues before terminating. 
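
Whether a small insertion or deletion shifts the reading frame depends only on whether its net length change is a multiple of three. A minimal sketch, applied to the three lesions described above (the 4 bp removal in pMQ4, the 2 bp fill-in in pMQ5, and the e2400 lesion); the function names are hypothetical:

```python
def net_length_change(deleted_bp: int, inserted_bp: int) -> int:
    """Net change in sequence length caused by a lesion."""
    return inserted_bp - deleted_bp


def causes_frameshift(deleted_bp: int, inserted_bp: int) -> bool:
    """True when the net length change is not a multiple of three."""
    return net_length_change(deleted_bp, inserted_bp) % 3 != 0


# (deleted bp, inserted bp) for each lesion, as described in the text.
LESIONS = {
    "pMQ4 (ApaI overhang degraded)": (4, 0),
    "pMQ5 (NdeI overhang filled in)": (0, 2),
    "e2400 (9 bp replaced by 2 bp)": (9, 2),
}

for name, (deleted, inserted) in LESIONS.items():
    change = net_length_change(deleted, inserted)
    print(f"{name}: net {change:+d} bp, frameshift={causes_frameshift(deleted, inserted)}")
```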

Figs. 3A-3B illustrate the coding sequence in capital letters,
while the introns and the untranslated and intergenic sequences
are in lower case letters. The protein sequence is shown
underneath the coding sequence. Position 1 of the nucleotide
sequence is the first base after the SL2 trans-splice acceptor
sequence. Position 1 of the protein sequence is the initiator
methionine. All PCR primers used for genomic and cDNA
amplification are represented by arrows. For primers extending
downstream (arrows pointing right) the primer sequence
corresponds exactly to the nucleotides over which the arrow
extends. For primers extending upstream (arrows pointing left)
the primer sequence is the complement of the sequence under the
arrow. In both cases the arrow head is at the 3' end of the
primer. The sequences of the two primers which flank gro-1
(SHP93 and SHP92) are not represented in this figure. Their
sequences are: SHP93 TTTCTGGATTTTAACCTTCC (SEQ. ID. NO:10) and
SHP92 GATAGTTCCCTTCGTTCGGG (SEQ. ID. NO:9). The wild type
splicing pattern was determined by sequencing of the cDNA.
Identification of the e2400 lesion was accomplished by
sequencing the e2400 allele. The e2400 lesion consists of a 9 bp
deletion and a 2 bp insertion at position 1196, resulting in a
frameshift.
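
For the left-pointing (upstream) primers described above, writing the primer 5' to 3' means taking the reverse complement of the displayed strand. A minimal sketch of that conversion follows; the helper name is hypothetical. As a sanity check it uses the hybrid primers SHP159 and SHP160 from the primer table, which should overlap as reverse complements if they anneal to opposite strands of the same junction.

```python
# Translation table for complementing DNA bases (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Primer sequences copied from the primer table.
SHP92 = "GATAGTTCCCTTCGTTCGGG"
SHP159 = "AATAAATATTTAAATATTCAGATATACCCTGAACTCTACAG"
SHP160 = "AAACTGTAGAGTTCAGGGTATATCTGAATATTTAAATATTTATTC"

# Top-strand sequence that the reverse primer SHP92 anneals to.
print(reverse_complement(SHP92))

# The forward hybrid primer, reverse-complemented, lies within the reverse one.
print(reverse_complement(SHP159) in SHP160)
```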
gro-1 is part of a complex operon (Figs. 3A-3B)

Amplification of the 5' end of gro-1 from cDNA occurred only
when the trans-spliced leader SL2 was used as the 5' primer, and
not when SL1 was used. SL2 is used for trans-splicing to the
downstream gene when two genes are organized into an operon
(Spieth et al., Cell 73: 521-532 (1993); Zorio et al., Nature
372: 270-272 (1994)). This indicates that at least one gene
upstream of gro-1 is co-transcribed with gro-1 from a common
promoter. We found that sequences from the 5' end of the next
three predicted genes upstream of gro-1 (ZC395.7, C34E10.1, and
C34E10.2) could all be amplified only with SL2. Sequences from
the fourth predicted upstream gene (C34E10.3), however, could be
amplified with neither spliced leader, suggesting that it is not
trans-spliced. The distance between genes in operons appears to
have an upper limit (Spieth et al., Cell 73: 521-532 (1993);
Zorio et al., Nature 372: 270-272 (1994)), and no gene is
predicted to be close enough upstream of C34E10.3 or downstream
of gro-1 to be co-transcribed with these genes. Our findings
therefore suggest that gro-1 is the last gene in an operon of
five co-transcribed genes (Fig. 4).
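
The inference above can be sketched as a simple rule: starting from the last gene and walking upstream, every gene whose 5' end amplifies with SL2 is treated as a downstream member of the operon, and the first gene that does not amplify with SL2 is taken as the operon's first gene. This is an illustrative simplification of the argument in the text, not a general operon-calling method. It also relies on the text's separate observation that no further gene lies close enough on either side. The function name and data layout are hypothetical.

```python
def infer_operon(genes_5_to_3: list, leader_hits: dict) -> list:
    """Return the inferred operon, ordered 5' to 3'.

    genes_5_to_3: predicted genes in chromosomal order, ending with the
        gene assumed to be last in the operon.
    leader_hits: gene -> set of spliced leaders ("SL1", "SL2") that
        amplified its 5' end.
    """
    operon = []
    for gene in reversed(genes_5_to_3):
        operon.append(gene)
        if "SL2" not in leader_hits[gene]:
            break  # not SL2 trans-spliced: candidate first gene
    return list(reversed(operon))


# Amplification results as reported in the text.
GENE_ORDER = ["C34E10.3", "C34E10.2", "C34E10.1", "ZC395.7", "gro-1"]
LEADER_HITS = {
    "C34E10.3": set(),      # amplified with neither leader
    "C34E10.2": {"SL2"},
    "C34E10.1": {"SL2"},
    "ZC395.7": {"SL2"},
    "gro-1": {"SL2"},
}

print(infer_operon(GENE_ORDER, LEADER_HITS))
```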

Nested PCR was used to amplify the 5' end of each gene. SL1- or
SL2-specific primers were used in conjunction with a pair of
gene-specific primers. cDNA generated by RT-PCR using mixed
stage N2 RNA was used as template in the nested PCR. Fig. 4A
illustrates a schematic of the gro-1 operon showing the coding
sequences of each gene and the primers (represented by flags)
used to establish the trans-splicing patterns.

Fig. 4B illustrates the products of the PCR with SL1- and
SL2-specific primers for each of the five genes. The sequences
of the primers used are as follows:

SL1:    TTTAATTACCCAAGTTTGAG (SEQ. ID. NO:61),
SL2:    TTTTAACCCAGTTACTCAAG (SEQ. ID. NO:62),
SHP141: AAAACTTCTACCAACAATGG (SEQ. ID. NO:30),
SHP142: CGTAATCTCTCTCGATTAGC (SEQ. ID. NO:31),
SHP143: CCGTGGGATGGCTACTTGCC (SEQ. ID. NO:32),
SHP144: TGGATTTGTGGCACGAGCGG (SEQ. ID. NO:33),
SHP145: TTGATTGCCTCTCCTCGTCC (SEQ. ID. NO:34),
SHP146: ATCAACATCTGATTGATTCC (SEQ. ID. NO:35),
SHP130: CATCCAAAAGCAGTATCACC (SEQ. ID. NO:24),
SHP119: ACATCTTTATCCATTTCTCC (SEQ. ID. NO:21),
SHP95:  TACAGGAATTTTTGAACGGG (SEQ. ID. NO:12),
SHP99:  ATCGATACCACCGTCTCTGG (SEQ. ID. NO:16).

The gene immediately upstream of gro-1 has homology to the yeast
gene HAM1, and we have renamed the gene hap-1. We have
established its splicing pattern by reverse transcription PCR
and sequencing. This revealed that hap-1 is composed of 5 exons
and produces an mRNA of 0.9 kb. We also found that sequences
which were predicted to belong to ZC395.7 (now hap-1) are in
fact spliced to the exons of C34E10.1. This is consistent with
our finding that hap-1 is SL2 spliced, as it puts the end of
C34E10.1 very close to the start of hap-1 (Fig. 4).
The gro-1 gene product 

Conceptual translation of the gro-1 transcript indicated that it
encodes a protein of 430 amino acids highly similar to a
strongly conserved cellular enzyme:
dimethylallyldiphosphate:tRNA dimethylallyltransferase (DMAPP
transferase). Fig. 5 shows an alignment of gro-1 with the
published sequences of the E. coli (P16384) and yeast (P07884)
enzymes. Residues where the biochemical character of the amino
acids is conserved are shown in bold. Identical amino acids are
indicated further with a dot. The ATP/GTP binding site and the
C2H2 zinc finger site are predicted and not experimental. The
point at which the gro-1(e2400) mutation alters the reading
frame of the sequence is shown. The two alternative initiator
methionines in the yeast sequence, and the putative
corresponding methionines in the worm sequence, are underlined.
Database searches also identified a homologous human expressed
sequence tag (Genbank ID: Z40724). The human clone has been used
to derive a sequence tagged site (STS). This means that the
genetic and physical position of the human gro-1 homologue is
known. It maps to chromosome 1, 122.8 cR from the top of the
Chr 1 linkage group and between the markers D1S255 and D1S2861.
This information was found in the UniGene database of the
National Center for Biotechnology Information (NCBI). We have
sequenced Z40724 by classical methods but found that Z40724 is
not a full length cDNA clone, as it contains neither an
initiator methionine nor the poly A tail. We used the sequence
of Z40724 to identify further clones by database searches. We
found one clone (Genbank ID: AA332152) which extended the
sequence 5' by 28 nucleotides, as well as one clone (Genbank ID:
AA121465) which extended the sequence substantially in the 3'
direction but did not include the poly A tail. We then used
AA121465 to identify an additional clone (AA847885) extending
the sequence to the poly A tail. Fig. 8 shows the full sequence,
with the putative initiator ATG shown in bold and the sequence
of Z40724 shown underlined. A comparison of the conceptual amino
acid sequences for GRO-1 and hgro-1p is shown in Fig. 9. Amino
acid identities are indicated by a dot. Both sequences contain a
region with a zinc finger motif, which is shown underlined.

An additional metazoan homologue is represented by a Drosophila
EST (Genbank accession: AA816785). In E. coli and other
bacteria, the gene encoding DMAPP transferase is called miaA
(a.k.a. trpX), and in yeast it is called mod5. DMAPP transferase
catalyzes the modification of adenosine 37 of tRNAs whose
anticodon begins with U (Fig. 6).

In these organisms the enzyme has been shown to use
dimethylallyldiphosphate as a donor to generate
dimethylallyl-adenosine (dma6A37), one base 3' to the anticodon
(for review and biochemical characterization of the bacterial
enzyme see Persson et al., Biochimie 76: 1152-1160 (1994); Leung
et al., J Biol Chem 272: 13073-13083 (1997); Moore and Poulter,
Biochemistry 36: 604-614 (1997)). In earlier literature this
modification is often referred to as isopentenyl adenosine
(i6A37).

The high degree of conservation of the protein sequence between
GRO-1 and the DMAPP transferases of S. cerevisiae and E. coli
suggests that GRO-1 possesses the same enzymatic activity as the
products of the previously characterized genes. The sequence
contains a number of conserved structural motifs (Fig. 5),
including a region with an ATP/GTP binding motif which is
generally referred to as the 'A' consensus sequence (Walker et
al., EMBO J 1: 945-951 (1982)) or the 'P-loop' (Saraste et al.,
Trends Biochem Sci 15: 430-434 (1990)).
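
A motif of this kind can be located in a protein sequence with a simple pattern search. The sketch below uses the commonly cited P-loop consensus [AG]-x(4)-G-K-[ST], as catalogued in PROSITE for the ATP/GTP-binding site motif A. The pattern is an assumption brought in for illustration; the specification does not state which pattern was used. The example peptide is a made-up toy sequence, not the GRO-1 sequence.

```python
import re

# P-loop / Walker 'A' consensus: [AG]-x(4)-G-K-[ST]
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein: str) -> list:
    """Return (1-based start position, matched residues) for each P-loop match."""
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(protein.upper())]


# Hypothetical toy peptide used only to exercise the function.
toy_peptide = "MKLVVIAGPTGSGKSTLAIQ"
print(find_p_loops(toy_peptide))
```

A match to such a short pattern is only suggestive; as the text notes for Fig. 5, the site is predicted rather than experimentally verified.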

In addition, at the C-terminal end of the GRO-1 sequence, there
is a C2H2 zinc finger motif as defined by the PROSITE database.
This type of DNA-binding motif is believed to bind nucleic acids
(Klug and Rhodes, Trends Biochem Sci 12: 464-469 (1987)).
Although there appears to be some conservation between the worm
and yeast sequences in the C-terminal end of the protein
(Fig. 5), including in the region encompassing the zinc finger
in GRO-1, the zinc finger motif per se is not conserved in yeast
but is present in humans (Fig. 9).

In yeast, DMAPP transferase is the product of the MOD5 gene and
exists in two forms: one form which is targeted principally to
the mitochondria, and one form which is found in the cytoplasm
and nucleus. These two forms differ only by a short N-terminal
sequence whose presence or absence is determined by differential
translation initiation at two in-frame ATG codons (Gillman et
al., Mol & Cell Biol 11: 2382-90 (1991)). The gro-1 open reading
frame also contains two ATG codons at comparable positions, with
the coding sequence between the two codons constituting a
plausible mitochondrial sorting signal (Figs. 3 and 5). It is
therefore likely that DMAPP transferase in worms also exists in
two forms, mitochondrial and cytoplasmic.

It should be noted, however, that the sequence of hgro-1 shows
only one in-frame methionine before the conserved ATP/GTP
binding site (Fig. 9). As we cannot be assured to have
determined the sequence of the full length transcript, it is
possible that further 5' sequence might reveal an additional
methionine. Alternatively, in humans, the mechanism by which the
enzyme is targeted to several compartments might not involve
differential translation initiation. In this context, it should
be noted that the sorting signals which can be predicted from
the sequence of hgro-1p are predicted to be highly ambiguous by
the prediction program PSORT II. Furthermore, a conceptual
translation of the Drosophila sequence (AA816785) predicts only
one initiator methionine before the ATP/GTP binding site as well
as several in-frame stop codons upstream of this start
(Fig. 10), suggesting that no additional upstream ATG could
serve as translation initiation site. In the figure, stop codons
are indicated by "stop", methionines are indicated by "Met", and
the conserved ATP/GTP binding site is underlined.
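
The argument about candidate initiator codons can be made concrete by walking upstream, codon by codon, from a fixed in-frame anchor (for example the first codon of the conserved binding site) and recording each in-frame ATG until the nearest in-frame stop codon is reached. The sketch below uses made-up toy sequences, not the hgro-1 or Drosophila sequences, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_in_frame_features(seq: str, anchor: int):
    """Scan upstream of `anchor` in its reading frame.

    anchor: 0-based index of the first base of an in-frame codon.
    Returns (atg_positions, stop_position): 0-based starts of in-frame ATG
    codons found before the nearest upstream in-frame stop, and the start of
    that stop codon (None if the sequence ends first).
    """
    seq = seq.upper()
    atgs = []
    stop = None
    pos = anchor - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            stop = pos
            break
        if codon == "ATG":
            atgs.append(pos)
        pos -= 3
    return sorted(atgs), stop


# Toy case 1: two in-frame ATGs between a stop and the anchor codon,
# analogous to the two candidate initiators described for yeast and worm.
two_starts = "TAAGCTATGGCTGGAATGAAAGGT"
print(upstream_in_frame_features(two_starts, anchor=21))

# Toy case 2: a single ATG preceded directly by an in-frame stop,
# analogous to the situation described for the Drosophila EST.
one_start = "TAAATGGCTGGT"
print(upstream_in_frame_features(one_start, anchor=9))
```

If the scan reaches the 5' end of the sequence without meeting a stop codon, an additional upstream initiator cannot be excluded, which is the caveat the text raises for the possibly incomplete hgro-1 transcript.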
Expression pattern of GRO-1

We have also constructed a reporter gene expressing a fusion
protein containing the entire GRO-1 amino acid sequence fused at
the C-terminal end to green fluorescent protein (GFP). The
promoter of the reporter gene is the sequence upstream of gop-1
(Figs. 13A-13C), the first gene in the operon (see Fig. 4). The
promoter sequence is 306 bp long, starting 32 nucleotides
upstream of the gop-1 ATG. It is fused at the exact level
upstream of gro-1 where trans-splicing to SL2 normally occurs.

The genes gop-2 (Fig. 14) and gop-3 (Figs. 15A-15B) are also
located in the operon (see Fig. 4), as the second and third
genes respectively.

We first constructed the clone pMQ8, in which gro-1 is directly
under the promoter for the whole operon, using the hybrid
primers SHP160 (SEQ. ID. NO:38) and SHP159 (SEQ. ID. NO:37) and
the flanking primers SHP161 (SEQ. ID. NO:39) and SHP162 (SEQ.
ID. NO:40) in sequential reactions, each followed by
purification of the products, and finally cloning into pUC18
(Fig. 11).

Primers SHP151 (SEQ. ID. NO:36) and SHP170 (SEQ. ID. NO:44) were
then used to amplify part of the insert in pMQ8 and clone it
into pPD95.77 (gift from Dr Andrew Fire), which was designed to
allow a protein of interest to be transcriptionally fused to
Green Fluorescent Protein (GFP) (Fig. 12).

The reporter construct fully rescues the phenotype of a
gro-1(e2400) mutant upon injection and extrachromosomal array
formation, indicating that the fusion to the GFP moiety does not
significantly inhibit the function of GRO-1. Fluorescence
microscopy indicated that gro-1 is expressed in most or all
somatic cells. Furthermore, the GRO-1::GFP fusion protein is
localized in the mitochondria, in the cytoplasm and in the
nucleus.

The hap-1 gene product (Fig. 16)

hap-1 is homologous to the yeast gene HAM1 as well as to
sequences in many organisms including bacteria and mammals
(Fig. 7).

The origin of the worm and yeast sequences is as
described above and below. The human sequence was
inferred from a cDNA sequence assembled from expressed
sequence tags (ESTs); the accession numbers of the
sequences used were: AA024489, AA024794, AA025334,
AA026396, AA026452, AA026502, AA026503, AA026611,
AA026723, AA035035, AA035523, AA047591, AA047599,
AA056452, AA115232, AA115352, AA129022, AA129023,
AA159841, AA160353, AA204926, AA226949, AA227197 and
D20115. The E. coli sequence is a predicted gene
(accession 1723866).

Mutations in HAM1 increase the sensitivity of
yeast to the mutagenic compound 6-N-hydroxylaminopurine
(HAP), but do not increase spontaneous mutation
frequency (Noskov et al., Yeast 12:17-29 (1996)). HAP is
an analog of adenine, and in vitro experiments suggest
that the mechanism of HAP mutagenesis is its conversion
to a deoxynucleoside triphosphate which is incorporated
ambiguously for dATP and dGTP during DNA replication
(Abdul-Masih and Bessman, J Biol Chem 261(5): 2020-2026
(1986)). The role of the Ham1p gene product in
increasing sensitivity to HAP remains unclear.

Explaining the pleiotropy of miaA and gro-1

Mutations in miaA, the bacterial homologue of
gro-1, show multiple phenotypes and affect cellular
growth in complex ways. For example, in Salmonella
typhimurium, such mutations result in 1) a decreased
efficacy of suppression by some suppressor tRNAs, 2) a
slowing of ribosomal translation, 3) slow growth under
various nutritional conditions, 4) altered regulation
of several amino acid biosynthetic operons, 5) sensitivity
to chemical oxidants and 6) temperature sensitivity
for aerobic growth (Ericson and Bjork, J. Bacteriol.
166: 1013-1021 (1986); Blum, J. Bacteriol. 170:
5125-5133 (1988)). Thus, MiaAp appears to be important
in the regulation of multiple parallel processes of
cellular physiology. Although we have not yet explored
the cellular physiology of gro-1 mutants along the
lines which have been pursued in bacteria, the apparently
central role of miaA is consistent with our findings
that gro-1, and the other genes with a Clk phenotype,
regulate many disparate physiological and metabolic
processes in C. elegans (Wong et al., Genetics
139: 1247-1259 (1995); Lakowski and Hekimi, Science
272: 1010-1013 (1996); Ewbank et al., Science 275:
980-983 (1997)).

In addition to the various phenotypes discussed
above, miaA mutations increase the frequency of
spontaneous mutations (Connolly and Winkler, J Bacteriol
173(5): 1711-21 (1991); Connolly and Winkler, J Bacteriol
171: 3233-46 (1989)). As described in the previous
section, we have preliminary evidence that
gro-1(e2400) also increases the frequency of
spontaneous mutations in worms.

How can the alteration in the function of DMAPP
transferase result in so many distinct phenotypes?
Bacterial geneticists working with miaA have generally
suggested that this enzyme and the tRNA modification it
catalyzes have a regulatory function which is mediated
through attenuation (e.g. Ericson and Bjork, J. Bacteriol.
166: 1013-1021 (1986)). Attenuation is a phenomenon
by which the transcription of a gene is interrupted
depending on the rate at which ribosomes can
translate the nascent transcript. Ribosomal translation
is slowed in miaA mutants, and this slowing could,
through an effect on attenuation, affect the expression
of the many genes that are regulated by attenuation.

gro-1(e2400) also produces pleiotropic effects
and, in addition, displays a maternal effect, suggesting
that it is involved in a regulatory process (Wong
et al., Genetics 139: 1247-1259 (1995)). However,
attenuation involves the co-transcriptional translation
of nascent transcripts, which is not possible in
eukaryotic cells, where transcription and translation are
spatially separated by the nuclear membrane. If the
basis of the pleiotropy in miaA and gro-1 is the same,
then a mechanism distinct from attenuation has to be
involved. Below we argue that this mechanism could be
the modification by DMAPP transferase of adenine
residues in DNA in addition to modification of tRNAs.

A role for gro-1 in DNA modification?

We observed that gro-1 can be rescued by a
maternal effect, so that adult worms homozygous for the
mutation, but issued from a mother carrying one wild-type
copy of the gene, display a wild-type phenotype, in
spite of the fact that such adults are up to 1000-fold
larger than the egg produced by their mother. It is
unlikely that enough wild-type product can be deposited
by the mother in the egg to rescue an adult which is
1000 times larger. This observation therefore suggests
that gro-1 can induce an epigenetic state which is not
altered by subsequent somatic growth. One of the best
documented epigenetic mechanisms is imprinting in
mammals (Lalande, Annu Rev Genet 30: 173-196 (1996)), which
is believed to rely on the differential methylation of
genes (Laird and Jaenisch, Annu Rev Genet 30: 441-464
(1996); Klein and Costa, Mutat Res 386: 103-105 (1997)).
Modification of bases in DNA has also been linked to
regulation of gene expression in the protozoan Trypanosoma
brucei. The presence of beta-D-glucosyl-hydroxymethyluracil
in the long telomeric repeats of T. brucei
correlates with the repression of surface antigen gene
expression (Gommers-Ampt et al., Cell 75: 1129-1136
(1993); van Leeuwen et al., Nucleic Acids Res 24:
2476-2482 (1996)).
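
The maternal rescue described at the start of this
section can be stated as a simple rule: the phenotype of a
homozygous mutant depends on the genotype of its mother
and not only on its own. A minimal sketch follows; the
function name and the genotype encoding are hypothetical,
and the sketch assumes that animals carrying at least one
wild-type copy are phenotypically wild-type.

```python
def gro1_phenotype(own_genotype: str, mother_genotype: str) -> str:
    """Predict the phenotype under maternal rescue.

    Genotypes are encoded as "+/+", "gro-1/+" or "gro-1/gro-1"
    (an encoding chosen here for illustration only).
    """
    valid = {"+/+", "gro-1/+", "gro-1/gro-1"}
    if own_genotype not in valid or mother_genotype not in valid:
        raise ValueError("unknown genotype")
    if own_genotype != "gro-1/gro-1":
        # Assumption: one zygotic wild-type copy is sufficient.
        return "wild-type"
    if mother_genotype == "+/+":
        # A +/+ mother cannot transmit a mutant allele.
        raise ValueError("homozygous mutant cannot derive from a +/+ mother")
    if mother_genotype == "gro-1/+":
        return "wild-type"  # maternally rescued homozygote
    return "mutant"         # homozygote from a homozygous mother


print(gro1_phenotype("gro-1/gro-1", "gro-1/+"))      # maternally rescued
print(gro1_phenotype("gro-1/gro-1", "gro-1/gro-1"))  # not rescued
```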

gro-1 and miaA mutations increase the rate of spontaneous
mutations, which is generally suggestive of a role in
DNA metabolism, and can be related to the observation
that methylation is linked to spontaneous mutagenesis,
genome instability, and cancer (Jones and Gonzalgo,
Proc. Natl. Acad. Sci. USA, 94: 2103-2105 (1997)).

Does gro-1 have access to DNA? Studies with
mod5, the yeast homologue of gro-1, have shown that
there are two forms of Mod5p: one is localized to the
nucleus as well as to the cytoplasm, and the other
is localized to the mitochondria as well as the
cytoplasm (Boguta et al., Mol. Cell. Biol. 14:
2298-2306 (1994)). The nuclear localization is striking, as
isopentenylation of nuclear-encoded tRNA is believed to
occur exclusively in the cytoplasm (reviewed in Boguta
et al., Mol. Cell. Biol. 14: 2298-2306 (1994)).
Furthermore, studies of a gene, maf1, have shown that
when mod5 is mislocalized to the nucleus, the
efficiency of certain suppressor tRNAs is decreased, an
effect known to be linked to the absence of the tRNA
modification (Murawski et al., Acta Biochim. Pol. 41:
441-448 (1994)). Finally, as described in the previous
section, gro-1 contains a zinc finger, a nucleic acid
binding motif. The zinc finger could bind tRNAs, but
as it lies in the C-terminal domain of gro-1 and human
hgro-1, which has no equivalent in miaA, it is clearly
not necessary for the basic enzymatic function. We
speculate that it might be necessary to increase the
specificity of DNA binding in the large metazoan
genome. It should also be noted that the second form
of Mod5p, which is localized to mitochondria, also has
the opportunity to bind and possibly modify DNA, as it
has access to the mitochondrial genome. See the
section entitled "A role for gro-1 in a
central mechanism of physiological coordination" for an
alternative possibility as to the function of GRO-1 in
the nucleus.

miaA and gro-1 are found in complex operons

We have found that gro-1 is part of a complex
operon of five genes (Fig. 4). It is believed that
genes are regulated coordinately by single promoters
when they participate in a common function (Spieth et
al., Cell 73: 521-532 (1993)). In some cases, this is
well documented. For example, the proteins LIN-15A and
LIN-15B, which are both required for vulva formation in
C. elegans, are unrelated products of two genes
transcribed in a common operon (Huang et al., Mol Biol Cell
5(4): 395-411 (1994)). One of the genes in the gro-1
operon is hap-1, whose yeast homologue has been shown
to be involved in the control of mutagenesis (Noskov et
al., Yeast 12: 17-29 (1996)). Under the hypothesis
that gro-1 modifies DNA, this suggests an involvement of
hap-1 in this or similar processes. The presence in
the same operon also suggests that all five genes might
collaborate in a common function. The phenotype of
gro-1 suggests that this function is regulatory. In
this context, it should be noted that miaA is also part
of a particularly complex operon (Tsui and Winkler,
Biochimie 76: 1168-1177 (1994)), although, except for
miaA/gro-1, there are no other homologous genes in the
two operons.

A role for gro-1 in a central mechanism of physiological
coordination

We have speculated that the genes with a Clk
phenotype might participate in a central mechanism of
physiological coordination, probably including the
regulation of energy metabolism. clk-1 encodes a
mitochondrial protein (unpublished observations), and
its homologue in yeast has also been shown to be
mitochondrial (Jonassen, T. (1998) Journal of Biological
Chemistry 273:3351-3357). The yeast clk-1 homologue is
involved in the regulation of the biosynthesis of
ubiquinone (Marbois, B.N. and Clarke, C.F. (1996)
Journal of Biological Chemistry 271:2995-3004).
Ubiquinone, also called coenzyme Q, is central to the
production of ATP in mitochondria. In worms, however,
we have found that clk-1 is not strictly required for
respiration. How might gro-1 fit into this picture?

One link is that dimethylallyldiphosphate (DMAPP) is
known to be the precursor of the lipid side-chain of
ubiquinone. In bacteria, ubiquinone is the major lipid
made from DMAPP. In eukaryotes, cholesterol and its
derivatives are also made from DMAPP. Interestingly,
C. elegans requires cholesterol in the growth medium
for optimal growth. This link, however, remains
tenuous, in particular in the absence of an understanding
of the biochemical function of CLK-1.

In several bacteria, the adenosine modification
carried out by DMAPP transferase is only the first step
in a series of further modifications of this base
(Persson et al., Biochimie 76: 1152-1160 (1994)).
These additional modifications have been proposed to
play the role of a sensor for the metabolic state of
the cell (Buck and Ames, Cell 36: 523-531 (1984);
Persson and Bjork, J. Bacteriol. 175: 7776-7785
(1993)). For example, one of the subsequent steps, the
synthesis of 2-methylthio-cis-ribozeatin, is carried
out by a hydroxylase encoded by the gene miaE. When
the cells lack miaE they become incapable of using
intermediates of the citric acid cycle such as fumarate
and malate as the sole carbon source.

Another link to energy metabolism springs from
the recent biochemical observations of Winkler and
co-workers using purified DMAPP transferase (E. coli
MiaAp) (Leung et al., J Biol Chem 272: 13073-13083
(1997)). These investigators observed that the enzyme
is competitively inhibited by phosphate nucleotides
such as ATP or GTP. Furthermore, using their estimates
of the Km of the enzyme and its concentration in the cell,
they calculate that the level of inhibition of the
enzyme in vivo would just allow the enzyme to modify
all tRNAs, but that any further inhibition would leave
some tRNAs unmodified. This suggests that the exact level
of modification of tRNA (or of DNA) could be exquisitely
sensitive to the level of phosphate nucleotides.
Superficially, this is consistent with the phenotypic
observations. The state of mutant cells which lack
DMAPP transferase entirely would be equivalent to that of
cells in which very high levels of ATP completely inhibit
the enzyme. Such cells might therefore turn down the
ATP generating processes in response to the signal
provided by undermodified tRNAs (or DNA).
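
The sensitivity to nucleotide levels described above
follows from the standard rate law for competitive
inhibition, v = Vmax[S] / (Km(1 + [I]/Ki) + [S]). The
sketch below evaluates this textbook expression. The
function name and all numerical values are hypothetical
and in arbitrary units; they are not the constants
measured by Leung et al.

```python
def competitive_rate(s: float, i: float, vmax: float, km: float, ki: float) -> float:
    """Michaelis-Menten rate in the presence of a competitive inhibitor.

    v = Vmax * [S] / (Km * (1 + [I]/Ki) + [S])
    """
    if min(s, i) < 0 or min(vmax, km, ki) <= 0:
        raise ValueError("concentrations must be >= 0 and constants > 0")
    # A competitive inhibitor raises the apparent Km and leaves Vmax unchanged.
    apparent_km = km * (1.0 + i / ki)
    return vmax * s / (apparent_km + s)


# Hypothetical values, arbitrary units: substrate fixed, inhibitor (ATP) varied.
for atp in (0.0, 1.0, 5.0, 20.0):
    v = competitive_rate(s=1.0, i=atp, vmax=1.0, km=1.0, ki=1.0)
    print(f"[ATP]={atp:5.1f}  relative rate={v:.3f}")
```

With a fixed substrate concentration, the rate falls
steadily as the inhibitor concentration rises, which is
the behaviour that would make the extent of modification
a read-out of nucleotide levels.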

More generally, GRO-1 could act in the crosstalk
between nuclear and mitochondrial genomes. The nuclear
and mitochondrial genomes both contribute gene products
to the mitochondrial energy-producing machinery, and
these physically separate genomes must therefore
exchange information somehow to coordinate their
contributions (reviewed in Poyton, R.O. and McEwen J.E.
(1996) Annu. Rev. Biochem. 65:563-607). Furthermore,
the energy producing activity of the mitochondria is
essential to the rest of the cell, and the needs of a
particular cell at a particular time must somehow be
conveyed to the organelle to regulate its activity. GRO-1
could participate in this coordination in the following
manner. GRO-1 is found in three compartments, the
nucleus, the cytoplasm and the mitochondria (see
above), and thus has the opportunity to regulate gene
expression in more than one way. How could its action
coordinate gene expression between compartments? GRO-1
could partition between the mitochondria and the
nucleus, and its relative distribution could be
determined by the amount of RNA (or mtDNA) in the
mitochondria (Parikh, V.S. et al. (1987) Science
235:576-580). For example, if the cell is rich in
mitochondria, much GRO-1 will be bound there, which
could result in a relative depletion of activity in the
cytoplasm with regulatory consequences on the
translation machinery. Binding of GRO-1 in the nucleus
could have similar consequences and provide information
about nuclear gene expression to the translation
machinery.

While the invention has been described in
connection with specific embodiments thereof, it will be
understood that it is capable of further modifications
and this application is intended to cover any
variations, uses, or adaptations of the invention following,
in general, the principles of the invention and
including such departures from the present disclosure
as come within known or customary practice within the
art to which the invention pertains and as may be
applied to the essential features hereinbefore set
forth, and as follows in the scope of the appended
claims.

WHAT IS CLAIMED IS:

1. A gro-1 gene which has a function at the level
of cellular physiology involved in developmental rate
and longevity, wherein gro-1 is located within an
operon and gro-1 mutants have a longer life and an
altered cellular metabolism relative to the wild-type.

2. The gro-1 gene of claim 1, wherein said operon
has the nucleotide sequence set forth in SEQ ID. NO:1.

3. The gro-1 gene of claim 1, which codes for a 
GRO-1 protein having the amino acid sequence set forth 
in Figs. 3A-3B (SEQ ID. NO:2). 

4. A gop-1 gene which codes for a GOP-1 protein
having the amino acid sequence set forth in Figs. 13A-13C
(SEQ ID. NO:4).

5. A gop-2 gene which codes for a GOP-2 protein
having the amino acid sequence set forth in Fig. 14
(SEQ ID. NO:5).

6. A gop-3 gene which codes for a GOP-3 protein
having the amino acid sequence set forth in Figs. 15A-15B
(SEQ ID. NO:6).

7. A hap-1 gene which codes for a HAP-1 protein
having the amino acid sequence set forth in Fig. 16
(SEQ ID. NO:7).

8. The gro-1 gene of claim 1, wherein said gene is 
of human origin and which has the nucleotide sequence 
set forth in Fig. 8 (SEQ ID. NO:3). 

9. A GRO-1 protein which has a function at the
level of cellular physiology involved in developmental
rate and longevity, wherein said GRO-1 protein is
encoded by the gene of claim 1 or 2.

10. A mutant GRO-1 protein which has the amino acid
sequence set forth in Fig. 3C.

11. A GRO-1 protein which has the amino acid 
sequence set forth in Figs. 3A-3B (SEQ ID. NO:2). 

12. A GOP-1 protein which has the amino acid 
sequence set forth in Figs. 13A-13C (SEQ ID. NO:4). 

13. A GOP-2 protein which has the amino acid
sequence set forth in Fig. 14 (SEQ ID. NO:5).

14. A GOP-3 protein which has the amino acid 
sequence set forth in Figs. 15A-15B (SEQ ID. NO:6). 

15. A HAP-1 protein which has the amino acid
sequence set forth in Fig. 16 (SEQ ID. NO:7).

16. A method for the diagnosis and/or prognosis of 
cancer in a patient, which comprises the steps of: 

a) obtaining a tissue sample from said patient; 

b) analyzing DNA of the obtained tissue sample of 
step a) to determine if the human gro-1 gene is 
altered, wherein alteration of the human gro-1 gene is 
indicative of cancer. 

17. A mouse model of aging and cancer, which
comprises a gene knock-out of a murine gene homologous to
the gro-1 gene of claims 1 to 3.

18. The use of compounds interfering with enzymatic
activity of GRO-1 of claim 9, 10 or 11 for enhancing
longevity of a host.

19. The use of compounds interfering with enzymatic 
activity of GOP-1 of claim 12 for enhancing longevity 
of a host. 

20. The use of compounds interfering with enzymatic 
activity of GOP-2 of claim 13 for enhancing longevity 
of a host. 

21. The use of compounds interfering with enzymatic 
activity of GOP-3 of claim 14 for enhancing longevity 
of a host. 

22. The use of compounds interfering with enzymatic 
activity of HAP-1 of claim 15 for enhancing longevity 
of a host. 

23. The use of compounds interfering with enzymatic 
activity of GRO-1 of claim 9, 10 or 11 for inhibiting 
tumorous growth. 

24. The use of compounds interfering with enzymatic 
activity of GOP-1 of claim 12 for inhibiting tumorous 
growth. 

25. The use of compounds interfering with enzymatic 
activity of GOP-2 of claim 13 for inhibiting tumorous 
growth. 

26. The use of compounds interfering with enzymatic
activity of GOP-3 of claim 14 for inhibiting tumorous
growth.

27. The use of compounds interfering with enzymatic
activity of HAP-1 of claim 15 for inhibiting tumorous
growth.

SL2 MIFRKFLNFLKPYKMR 16

aaaatatcgtraggaaataata^ 1394 

T D P I I F V I G C T G T G K S D L G V A I A K K 1 6 G E V I S V 49 

GAACGGATCCGATTATTTTCGM m 

1 SHP109 

D S H Q F I K G LDIATHKIT 66 

AGATTCAATGCWTTTATAMGgtacatggqttttgtttcaattttaaattaattaattttcgtttttcagGACTTGAOTTGCCACXMTAAGATAAC 1594 



EEESEGIQHHHMSFLNPSESSSYNVHSFREVTL 99 

GGMGAAGAATCTGMGGGAnCAACATCATATGATGTCATTTTTGMTCCATCTGAATCATCATCTTATMra^ 1694 

SHP94 1 

D L I K K I R A R S R I P V I V G 116 

GATCTTATTAAAgtgcttaattcgccactttttga 1794 

' SHP85 

GTTYYAESVLYENNLIETNTSDDVDSKSRTSSE 149 

GAGGMCCACTTATTATGCTGAAAGTGTCCffl^ 1894 

SHP9S f 

S S S E D T E E G I S I Q E L I D E L K K I D E R S A L L L H P N 182 

ATCGTCATCTGAAGACACTGMGAAGGAATTAGTMTCAAGMTTATGOTGAATTG 1994 

NRYRVQRALQIFRETG 198 
AATCGTTATCGAGTACAGAGAGCATTGCAMTTTTCAGAGRMCTGgtaattgatttgcaaatttccagattaaaaacaaatcaagtaaagttttttgca 2094 



IRKSELVEKQKSDETVDLGGRLRFDNSLVIFMD 231 

gGAATCCGAAAAAGTGAACTTGTTGAAAAACAGAAATCAGATGAAACTGTTGATTTGGGTGGACGACTACGATTTGATAATTCTTTAGTTATTTTTATGG 2194 
SHP97 ? 

ATPEVLEERLDGRVDKMIKLGIKNELIEFYNE 263 

ATGCAACACCTGAAGTTTTAGMGAAAGACTTGATGGAAGAGTTGATAAAATGATTAAATTGGGTTTGAAGfiATGAATTGATCGAGTTTTATAACGAGgt 2294 



aaatatttgaatttttccagaaaaaaaaagaaaattttttattattttgtttttttttcattctttactattttccaaaaaagtttaaacttttgaaaac 2394 



H A E Y 267 

tgttcagaaaatgttcgtgtatttattttagcttactgaggcattatttcattgtgatttttactatactctataaactaaattttcagCACGCCGAGTA 2194 

INHSKYGVMQCIGLKEFVPWLNLDPSERDTLNG 300 

CATMCAC&GCAMTATGGTGTCATGCMGTO 2594 

D K L F K Q G CDDVKLHTRQY 318 

GATAAATTGTTCAAGCAAGGgtaatttaaatttattttcaatttttataaattccaagctattttcagATGCGATGATGTGAAGCTTCACACTCGACAAT 2694 
ARRQRRWYRSRLLKRSDGDR 33
ATGCRCGGC<K3CAGA6ACGGTGGTATCGATC6AGACTTTTAAMCGGTCGGATGGTGATCGGgtatgttgattttaaaaaaattgaatttttaaagaact 219 

tttttactaaattaacaaagttattggctgaaaatggctgaaaattatagtaaaactaatcaaaaaaattgaaattttgaattaaagtcataaagtgacg 289 

KMASTKMID 34 

accagaaaattaaaaaaaaacatttttctattttaattaattcactctacttcactttaaaaataattttcagAAAATGGCAAGTACAAAAATGCTGGAT 299 

TSDKYRIISDGMDIVDQWMNGIDLFED 37 
ACftTCTGACAAGTACCGAATAATTAGTGATGGAATGGACATTGTTGATCAATGGftTGAATGGAATCGATCTATTTGAAGATgtaaaatttcacaaattct 309 

I S T D T N P I I K G S D A S I L L » C E I 39 
aaaatttccgaatcacaaattaaaatttctacagATCTCCACAGACACCAATCCAATTCTAAAAGGGTCCGATGCAAATATTCTGCTGAATTGTGAAATC 319 

CNISMTGKDHW QKEIDGKK \\ 

TGTMTATTTCAATGACTGGAAM(^TAATTGgtttgtttcaatacatattataatttcgaaatgaattttttcagGCAGAMCATATCGATGGGAAAM 329 
SHP110 T T SHP100 

II K B H A K Q K K L A E T R T . 43 

GCACAAGCAICATGCTAAGCAAAAGAAATTGGCAGAGACTCGCACAtaaqacgctatatttattttttgttaacttaaattatttttgttgttgattgtt 339 

polyA 

r 

ctctaaataaaaaaacagctcagagagaagattaggcgctcgtccacatctccgacgatagtcaacccgaacgaagggaactatctttaattgtcagtga 349 

' SHP92 

"h==r--3C 
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tgatttttactatactctataaactaaattttcagCACGCCGAGTACATAMTCACAGCAAATATGGTGTCACG 1197 

HAEYINHSKYGVT 276 

TTGGTCnmGAATTCGmCATGGCTCMmACCCAramMTACACTCMTOGATAAAmT 1272 

L V I I N S F H G S I « T H Q I » I H S » G I H C 301 

TCAAGCAAGGgtaatttaaatttattttcaatttttataaattccaagctattttcagATGCGATGATGtgaagcttc 1350 

3 8 I D A H H • 308 
[Figure residue: lane labels gop-1, gop-2, gop-3, hap-1 and gro-1; size markers 925 bp and 421 bp]
Sequence of GRO-1 and homologues



i it ii \ i 1 1 i iii i 

Celegaas 1 MIFRRFLNFLKPYKMRTDPIIFVI6CTGTGKSDLGVAIAKKYGGEVISVDSHQFYK6LDIATNKITEEESEGIQ 
S.cerevisiae i MLKGPLKGCLNMSKKVIVIAGTTGVGKSQLSIQLAQKFNGBVIHSDSMQVTKDIPIITNKHPLQBREGIP 
B.coji i MSDISKASLPKAIFLMGPTASGKTALAIELRKILPVELISVDSALIYKGHD1GTAKPNAEELLAAP 

ATP/GTP 
binding site 



C.elejans « BHMSFLNPSESSSYNVHSFREVTLDLIKKIRARSKIPVIVGGTTYYAESVLYENNLIETNTSDDVDSKSRTSSE 
S.cerevisiae v HVMKHVDWSE--EYYSHRFETECHNAIEDIHRRGKIPIWGGTHYYLQTLFNKRVDTKSSERKLTRKQLDILES 
E.coJi 9 RLLDIRDPSQ--AYSAADFRRDALAEHADITAAGRIPLLVGGTMLYFKALLEGLSPLPSADPEVRARIEQQAAE 



C.elegans isi SSEDTEEGISNQELWDELKKIDEKSALLLHPKHRYRVQRALQIFRETGIRKSELVEKQKSDETVDLGGRLRFDN 

S.cerevisiae i« DPDV IYKTLVKCDPDIATKYHPNDYRRVQRMLEIYYKTGKKPSETFHEQK ITLKFD-' 

!iCO ii m 6NES LHRQLQEVDPVAAARIHPNDPQRLSRALEVFFISGKTLTELTQTSG DALPYQV 

i i ♦ » • » • 

C.elegaea 226 LVIFMDATPEVLEERLDGRVDKMIKLGLKNELIEFYNEHAEYIKHSKYGVMQCIGLKEFVPWLNLDPSERDTLN 
S.cemisiae m LFLWLYSKPEPLFQRLDDRVDDMLERGALQEIKQLYEYYSQNKFTPEQCENGVWQVIGFKEFLPWLTGKTDDNT 

B. coli 202 QFAIAPASRELLHQRIEQRFHQMLASGFEAEVRALFARGDLHTDLPSIRCVGYRQHWSYLEGEISYDEMVYRGV 

I • I 1 « • I t 

C. elegans Jm DKLFKQGCDDVKLHTRQYARRQRRWYRSRLLKRSDGDRKMASTKMLDTSDKYRIISDGHDIVDQWHNGIDLFED 
S.cmrisht 280 KLEDC I ERMKT - - RTRQYAKRQVKWIKKMLI PD IKG DILLDATDL SQWDTKASQRAI AISHDF ISNRPIKQERA 
EiCoIi m —- ATRQLAKRQITWLRGWEGVHWLDSEKPEQARDEVLQVVGAIAG 

, C2H2 zinc finger . 

C.elegans « gTHTWP IT.KflSDANTLLM C EICH I SMTGKDNWOKHI DGKKHK HHAKQKKL ATRT 

s'.cerevime 353 KALEELLSKGETTHKKLDDWTHYTRNVCRNADGKNWAIGEKYWKIHLGSRRHKSNLKRNTRQADFEKWKINKK 

1 

+ x=7 - SB 
Sequence of HAM1 and its homologues



H, sapiens MAASLVGKRIVFVTGNAKKLEEWQILGDKFP CTLVAQKIDLPEYXG- EPDEI S I QKCQE 

C, elegans MLYILWKLNYLQKKMSLRKINFVTONVKKLEEVKAILKKFE VSNVDVDLDEFQG-EPEFIAERKCRE 

fl, cerevisiae MSNNEIVFVTGNAHKLKEVQSILTQEVDNNNKTIHLINEALDLEELQDTDLNAIALAKGKQ 

I. coll MQKWLATGNVGXVRELASLLSDFGLD IVAQTDLGVDS AEETGLTF IENAILKA 



H, sapiens AVRQV-QG-PVLVEDTCLCFHALGXLPGPyiKMFL--EKLKPEGLHQLLAGFED KSAYALCTFALSTGDP 

C, elegans AVEAV-KG-PVLVEDTSLCFHAMGGLPGPYIKWPL--KNLKPEGLHNMLAGFSD KTAYAQCIFAYTEG-L 

s, cerevisiae avaalgrgkpvfvedtalrfdefnglpgayikwfl-ksmglekivkmlepfen knaeavtticfadsrg 

e, coli rhaakvtalpaiaddsglavdvlggapgiysarysgedatdqknlqklletmkdvpddqrqarfhcvlvylrhae 



h, sapiens sqpvrlfrgrtsgriv-aprgcqdfgwdpcfqp-dgyeqtyaempfaeknavshrfrallelqeyfgslaa 

c. elegans gkpihvfagkcpgqiv-aprgdtafgwdpcfqp-dgfketfgehdkdvkneishrakalellkeyfqnn 

s. cerevisiae e-yhffqgitrgkiv-psrgpttfgwdsifepfdshgltyaemskdaknaishrgkafaqfkeylyqndf 

I, coli bptplvchgswpgvitrepagtggfgydpiffv-psegktaaeltreeksaishrgqalkllldalrng 
mRNA sequence of human homologue of gro-1: hgro-1







TGGCGGCTGC 


ACGAGCAGTT 


CCTGTGGGCA 


{j 1 LjLjoC 1 \^t\KJ 




rGGACCCTAC 


CTCTTGTAGT 


GATTCTCGGG 




rmnr a a atp 


r acgctggcg 


TTGCAGCTAG 


GCCAGCGGCT 


Lbblbb 1 brtb 






GCAGGTCTAT 


GAAGGCCTAG 


ALA 1 LH 1 bAL 


C* A APAAPtPtTT 


TCTGCCCAAG 


AGCAGAGAAT 

X x x X V — J X X x *• 


CTGCCGGCAC 


bALA 1 oA 1 U A 


PPTTTPTPPA 


TCCTCTTGTG 

X \j \ • X vy X X W X \_) 


ACCAATTACA 


CAGTGGTGGA 


b i 1 L- AbjAAA 1 


APAPPAAPTH 


x x vin x x v_j£t. 


AGATATATTT 

X 1>J4 ^ X Xi X A 4 X X. «J- 


GCCCGAGACA 


AAA 1 IbblAi 


TPTTPTPPPA 


HGAACCAATT 


ATTACATTGA 


ATCTCTGCTC 


1 CjUAAAAjj lib 


TTPTPAATAP 


r a agcctcag 


GAGATGGGCA 


CTGAGAAAGT 


r~* 7\ r P r Pf , 7\f , '~ , P*P , A 
bAl ibAbbbA 


AAAPTPPAPP 


TTPAAAAGGA 


GGATGGTCTT 


GTACTTCACA 




PPAPPTPPAP 


PPAQAAATHG 


CTGCCAAGCT 


GCATCCACAT 


r* A r A A A C CiC A 


AAGTGGCCAG 


GAGCTTGCAA 


GTTTTTGAAG 


AAACAGGAAT 


CTCTCATAGT 


GAATTTCTCC 


ATCGTCAACA 


TACGGAAGAA 


GGTGGTGGTC 


CCCTTGGAGG 


TCCTCTGAAG 


TTCTCTAACC 


CTTGCATCCT 


TTGGCTTCAT 


GCTGACCAGG 


CAGTTCTAGA 


TGAGCGCTTG 


GATAAGAGGG 


TGGATGACAT 


GCTTGCTGCT 


GGGCTCTTGG 


AGGAACTAAG 


AGATTTTCAC 


AGACGCTATA 


ATCAGAAGAA 


TGTTTCGGAA 


AATAGCCAGG 


ACTATCAACA 


TGGTATCTTC 


CAATCAATTG 


GCTTCAAGGA 


ATTTCACGAG 


TACCTGATCA 


CTGAGGGAAA 


ATGCACACTG 


GAGACTAGTA 


ACCAGCTTCT 


AAAGAAAGGA 


CCTGGTCCCA 


TTGTCCCCCC 


TGTCTATGGC 


TTAGAGGTAT 


CTGATGTCTC 


GAAGTGGGAG 


GAGTCTGTTC 


TTGAACCTGC 


TCTTGAAATC 


GTGCAAAGTT 


TCATCCAGGG 


CCACAAGCCT 


ACAGCCACTC 


CAATAAAGAT 


GCCATACAAT 


GAAGCTGAGA 


ACAAGAGAAG 


TTATCACCTG 


TGTGACCTCT 


GTGATCGAAT 


CATCATTGGG 


GATCGCGAAT 


GGGCAGCGCA 


CATAAAATCC 


AAATCCCACT 


TGAACCAACT 


GAAGAAAAGA 


AGAAGATTGG 


ACTCAGATGC 


TGTCAACACC 


ATAGAAAGTC 


AGAGTGTTTC 


CCCAGACTAT 


AACAAAGAAC 


CTAAAGGGAA 


GGGATCCCCA 


GGGCAGAATG 


ATCAAGAGCT 


GAAATGCAGC 


GTTTAAGAGA 


CATGTCCAGT 



GGCCTTTGGA AAGGTGGTGG GGATCCAGTT CAGGAGGGAG GGGTATGTTT 
GTCTCCCAGT CTGGGCAAAG GAGTGCTATG CGGAATTCTC TGCATAGCAG 
AAAAGCTCCC ACCATTTTCT TTTGATGTGG TTTTAAAGTC TCACGTTCTC 
TATAATAGAA ACAGCAGGTC TTGTCAGCTC CTTGTGTGGC TGATGTGTCT 
GGAAATGATG TAGTTCAGGA AAGCATTTTT TTTTTCTTTG AACCTTAAAG 
GTTCTATTAT TAAAAGCAGC ACAGATTCCA CATTTTTATA CATGAGGATC 
TTCTTTGTGG TGAATACCAG GATTGACTGC ATCCCTTTAA AAGAAGTTTT 
ATGTCCCTGA CTCTGGCTAA AATTATCTAA TTTCCAGATG CTTTTGTAGA 
TGACTGAAGT ATTTGTGAGC CACATATTGG GAGTTCTAGA TTTGAGTGAA 
TGGCAGGAAA GGGCCATCTC CATTGAGATG ATTAAGTGAA CCAAACTAGT 
TCTCGGAATT CTACAGAGAA GGAGGGAATC AGACTGAGGA AGCTGTGACA 
TAGGACTTGA AGACCAAAGA CTTTGAAATT TGCGAGCTGC TCATGTGTGA 
GTTATTATCA CTGCTGTCTT TCTATTGAGT TACAAATCTA TATTTTTATT 
GAAGTTTAAA TAAAGAAAAA ATTTACAAGA AAAAAAAAAA A 

j-j:=T -_B 
GRO-1 and its human homologue hgro-1p



M • I MMH I III I till I 1 1 1 1 I I 



(30-1 IMmKPYMTDPM 



i ♦ in ii i i i ii 1 1 i it in mi ii ii i 

00-1 HTEEESMPiSFWSESSSM™ 



I | | | I I I I II I I II • MM 



tpj-lp TKPQEMGTEKVIDRKVELEKEDGD7 mSQVDPEMAAKLHPlKRKVARSI^WEETCISH 
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II | | Mill I II Mil Ml I II M I 

hgro-lp SEFLHRQHT^^ 

gr>i w^MmMm^^^Mmmmm^—wm 



I I I II Ml I • • 

hgro-lp SMSQDYQHGIFQSIGFKEFHEYLITEGKCTLETSNQLLKRGPGPIVP 

ffl>l YIlSKY-^IGIiKEFVmDPSEm 



II 



hgro-lp VSDVSMMEPALEWQSnQGHKPTATPIMlEMR» 

GRH RSDGDMSm 



II i i I I I l I 



hgro-lp mi rnRTTT ^DREI/^IKSKSHLNOIMI ^DSDAVM'I esqsvspdymkepkgkgspgqwdqelkcsv 



C2H2 zinc finger 

-t-==r - SB 
Structure of pMQ8



Sad 

gtacgtglgagctck 

\ SHP161 

I _ ► 

atcgtgttccaggtg^aactatatattgagcaggaggacgagttgtttgtttcatgctgcttaaaaataaaaatg 

I ► 

, J SHP151 
cagcgagclgca^, 

SphI 

gaaaattgagtcaaaaagttgagataaaacaaattaaaacaattttctgaaaaataaacaactgaaatttgaagtaataaacaacacgcgaaaacgttat 



ttcggagcatcgtttgagaagtaaaactttttttcggcgcacccttgtgcgcagtttttatcttctcttttaatttaattttcaagctaaatctttcttt 



promoter 



ttaaactttq 



gro-1 



SHP159 MIFRKFLNFIKPYKMR 



jaataaatatttaaatattcag itataccctgaactctacagtttRTGmTTCAGGMTTTCTGMTTTTCTGMCTTACAmTGC 



SHP160 

T D P I I F V I G C T G T G K S D L G V A I A K K Y G G E V I S V 
GMCGGAMATTAHnCGTGATTGGGTGCACTO 
D S H Q F y K G L D I A T I .

AGATTCAATGCAATTTTATAAAGgtacatgggttttgtttcaattttaaattaattaattttcgtttttcagGACTTGACATTGCCACGAAT. 



B A K Q K K L \ I T R T • 

Ma _ 



.CATGCTMGCAAMGA AATTGGCAGAGACTCGCACAt aaqacqctatatttattttttgttaacttaaattatttttgttgttgattgtt 

1 shpho 

■|tctaga]tatact 
Xbal 



ctctaaataaaaaaacagctcagagagaagattaggcgctcgtccacatctccgacgatagtcaacccgaacgaagggaactatctttaattgtcagtga 

SHP162 V 

^- [ctgcagltgtcat 

PstI 



-h==r - 7 7ft 
Construction of pMQ18



SHP151 



SHP170 



a a a a. a y\ a a" SHP15I5HP170 

Ezdyb = =A= = /\z/b / ^tAz/b PCR product amplified 

promoter gro-1 from pMQ8 




pPD95.77 



gfp unc-54 3" UTR 



gro-1 gfp 200 aa 

GRO-1 ::GFP Fusion Protein 
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atcgtgttccaggtgcaactatatattgagcaggaggacgagttgtttgtttcatgctgcttaaaaataaaaatggaaaattgagtcaaaaagttgagat -9557 

aaaacaaattaaaacaattttctgaaaaataaacaactgaaatttgaagtaataaacaacacgcgaaaacgttatttcggagcatcgtttqagaagtaaa -9451 

actttttttcggcgcacccttgtgcgcagtttttatcttctcttttaatttaattttcaagctaaatctttctttttaaactttgaataaatatttaaat -9351 

H F R K L G S S G S L I K P K I P H S I E 21 

attcagaatgcaccaataaacctggaacaaaatcgata ATGTTCCGCAAGCTTQGTTCT TCTG^TCACTATGGRAGCCGAAAAATCCGCATTCTlTGGA -9251 

SHP190 1 

Y L K Y L Q G V L T K N E K V T E I H I I I L V E A L R A I A E I 54 

ATACCTCAMTATTTACMGGAGTGCTCACAAAAMTGAGAMGTTACGGAAAACAATAAGAAAATATTAGTAGAAGCATTACGAGCTATCGCAGAAATT -9157 

L 1 W G D Q N D A S V F D F F L E R 72 

CTCATTTGGGGCGATCAGAATGATGCTTCGGTTTTTGAgtgagtttttttccaatgttttttttcaaatctgatgttgaatttcagnTCTTCCTTGAGC -9057 

QNLLYFLKIMEQGNTPLNVQLLQTLNILFENIR 105 

GGCAMTGCTTCTTTATTTCTTGAMTTATGGAACAAG^ -8957 

T SHP171 

H E T s L 1 FLLSNHHVNSII 123 

ACATGAAACTTCACTTTgtaagttttttatatggattttcgcttaaaattgccagttttcagATTTCCTTCTAAGTAACAATCATGTAAACTCGATTATT -8857 

S B K F D L Q H D E I M A Y Y I S F L K T I S F K L N P A T I H F F 157 

TCCCACAAATTCGATTTACAAAATGATGAGATCATGGCTTACTACATTAGTTTTCTGAAAACTCTTTCATTTAAACTGAATCCAGCTACMTCCACTTCT -8757 
F-NETTEEFPLLVEVLKLYHWNESMVRIAVRNIL 190

TCTTCAATGAMCGACTGAAGAATTT CCATTGTTGGTAGAAGTTTT GWGCTTTA TMTTGGMTGAATCAATGCT TCG?^TTGCTGTTAGMRTATTCT -8651 

r ~SSS ' 

L N I V R V Q D 0 S H I I F A I K H T K 210 

TTTAAATATTGTGAGAGTTCAAGATGATTCAATGATTATTTTCGCTATCAAGCATACAAAAgttagtagaaaattattttgaaaaggtgtatttaagcaa -8551 

EYLSELIDSLVGLSLEMDTFVRSAENVLAN 240 

taaatattacagGMTATCTATCGGAGTTMTAGAnCTCTAGTTGGTCTCTCACTTGAAATGGACACATTTGTACGATCTGCTGAGAATGTGTTAOT -8457 

R E R L R 6 K V D D L I D L I H Y I G E L L D V E A V A E S L S I 273 

ATCGAGAGAGATTACGA GGAAAAGTGGATGATTTAATT GATTTGATTCATTATAT'IGGT GAACTATTGGATGTGGAAGCTGTCGCCGAAAGTTTATCAAT -8357 

SHP142 SHP173 ^ 

L y TTRYLSPLLLSSISPR 291 

TTTAGgtcagttttactgctggaaaatcaagtttttaatgttaaattttcagTAACAACACGATACTTAAGCCCTCTATTACTTTCAAGTATATCACCAA -8257 



RDNHSLLLTPISALFFFSEFIL 313 

GAAGAGATAATCATTCACTTCTACTCACTCCGATTTCTGCGTTATTTTTTTTCTCTGAATTTTTATTGgtgagttttaacatttaaaattacatttttct -8157 

I V R H H E I I Y T F L S S F L F D T Q N T I T T H W I 341 

aatttatttatttttcagATAGTTCGTCACCATGAAACAATATATACATTTTTATCATCTTTCCTATTTGACACTCAGAATACTTTGACGACCCATTGGA -8057 

RHNEKYCLEPITLSSPTGEYVNEDH 366 

TACGTCATAATGAGAAATATTGCTTAGAACCGATTACATTATCATCACCAACCGGAGAATATGTGAATGAAGACCAgtaagagctgaaattttaaaattt -7957 

VFFDFLLEAFDSSQADDSKAFYGLH 391 

ttgctttgaatatagtattttcagCGTATTTTTCGATTTrCTACTGGAAGCATTTGATTCCAGTCAAGCAGACGATTCGAAGGCATTCTATGGATTAATG -7857 
l I \ S K F Q N 1 A 401
CTGATTTATTCAATGTTTCAGAATAATGqtgagttttaaaaaattgatttgttaaattaaaatttccatttccaataactcctcttcagacagtaagttt -1157 

tcaatgttgtaaagttcctgttcatctgtgatcgttttcttcatttttttagttttgcatgaacagttttcaaatttttttgatatcatacagtaaatat -1657 

cgtcatccagataattttctatttaaaaaaaatgaataaaaagagggcgcgcagaaattgccgaagtaatgtaaatttaaagggacacatgcgtagcttg -7551 

ttgtgtgggtctcgccgcgctttgtttgatttatcttgttttctgctcaaagagctgtttttattttagcgttgaatgcttttttaccgttctcatcggc -7451 

tttttaataggaatatttaaaaaaaaaggtttaataaatcttcgtttttacaaaatccatctaagatttgcatttgtgaagctcaacaagtaaagtttta -1351 

agtaacattgttttttaaaaaacaattgaaccaaattttgccgaaacattaataacatgacgatactctataaaatattcctcttttcaaaataaatttt -1257 

D V G E L L S A A H F P V L K E S T T T S L A Q Q N 427 

caaaaaaaatccatttttcagCCGATGTTKAGAACTTCTATOT -7157 

T SHP174

L A R L R I A S T S S I S K R T 1 A I T E I G V E A T E E D E I F 480 

TCTTGCTCGTCTCCGMTfiGCATCTACGTCTTCCftTATCAMGCGAACGAGAGCTATCACTG AMTTGGAGTAGMGCGACCG AGGRAGATGRGATTTTT -7057 

SHP185 * 

H D V P E E Q T L 469 

CATGATGTTCCTGAAGAACAAACGTTGgtaagtaaataaatcaacattgattgttacacaaactttaatatttttaaatttgaaaattttcttcaaagtg -6957 

EDLVDDVLVDTENSAISDPE 489 
ctcaaaaatcctgtcgaaaattacagGAAGATCTGGTGGATGATGTATTGGTTGATACTGAAAATTCAGCAATAAGTGATCCAGAAgtgagtagaaaacg -6857 

PKHVESESR 498 

tgcatgtattaattattaaaaaaaaaatatagttttccccagttttccttgacctaaaactcagcaatttcagCCTAAAAACGTGGAGTCAGAATCTCGT -6757 
SRFQSAVDELPPPSTSGCDGRLFDALSSIIKAVG 532
TCTCGATTTCftATCTGCTGTTGATGAGCTTCCACCTCCGTCGACTTCTGGATGTGATGGTCGACTTTTTGATGCACTTTCATCGATTRTCRAAGCAGTTG -6651 

T D D H R I R P I T I E L A C L V I R Q I I H T V D D E ft 561 
GAACAGATGACAATCG AAnCGACCAATTACATTGGM CTTGCATGTCTTGTMTTCGGCAAATTTTAATGACTGTTGATGATGAAAMqtaagattaca -6551 

SHP175 T 

VHTSLTKLCFEVRLRLLS 519 
aattcaaaattqagcaaaatcaqaatctaaatttcataaattqttcagGTACATACCAGTTTAACGAAATTATGCTTCGAAGTTCGTCTAAAACTTTTAT -6151 

S I G Q Y V N G E N L F L E W F E D E Y A E F E 603 
CATCAATTGGACAATATGTTAATGGAGAGAATCTGTTTTTGGAGTGGTTTGAGGATGAATATGCAGAATTTGAAgtaagccaagaggtccgaaaataatt -6351 

VHHVNFDIIGHEHLLPPAATPLSNLLL 630 
taattcatcctttttattcagGTGAATCACGTGAATTTCGATATAATCGGTCACGAAATGCTTCTTCCTCCAGCTGCAACTCCTCTTTCGAATCTGCTAC -6251 

HKRLPSGFEERIRT Q I V 641 

TTCATAAGCGATTGCCCAGTGGATTTGAAGAACGAATAAGAACTgtaggaaactttttaaatttgaaaattaattatatatatatttgcagCAAATCGTA -6151 

FYLHIRKLERDLTGEGDTELPVRVLNSDQEPVAI 681 
TTCTACCTACATATTCGAAAATTGGAACGAGATTTGACCGGTGAAGGAGACACAGAATTACCTGTGAGAGTGTTGAATTCTGATCAGGAACCAGTTGCCA -6051 

G D C I H L H HSDLLSCT 696 

TCGGTGATTGTATTAATTTACgtgagttcatctgcatagaaaacaccatatttctactcaaattaacaattttcagATAATTCGGATCTTCTATCCTGCA -5951 

V V P Q Q L C S L G I P G D R L A R F L V T 0 R L Q L I L V E P D 129 

CTGT GGTTCCTCAACAACTATGTTC TCTTGGAAAACCTGGTGATCGTCTTGCTCGATTCCTTGTCACTGATAGACTTCAATTAATTCTTGTCGAACCGGA -5851 
J SHP176 

S R K A G W A I V R F V G L L Q D T T I N G 0 S T D S K V L B V V 162 

TTCTCGAAMGGCGGATGGGCAATTGYTCGATTCGTAGGACnCTTCAAGATACAAC AATTAATGGAGATTCTACGGA TTCGAAAGTTTTGCATGTTGTG -5151 

~ SHP17? ? 

V E G 0 P S R I K K R H P V 1 T A 119 

GTGGAAGGGCAACCCTCGAGAATTMGgtaagaatactaacgggaaaaaaaaatcaaaaaattacttctgtttcagAAMGACATCCGGTTTTAACTGCA -5651 
AFIFDDHIRCNAAKQRLTK 798
AAGTTCATATTCGATGATCACATTCGGTGTATGGCAGCAMGCAACGGCTCACCAAGgtaacggaaaaaataaccaaaaagacggaaagttattgtaaat -5557 



ggacgaaatcggcgaaattaattgaaaacgtttgaatttgccgctaaaaccaaacgaaaaccaaacgaaagcgaaatttaactatcccttcaggtagaat -5157 

GRQTARGLKLQAICSALGVPRIDPAT 824 

atacattttatttctctttatagGGTCGCCAAACAGCACGTGGTCTGAAACTTCAGGCGATATGTTCAGCTCTTGGAGTTCCACGTATCGATCCAGCGAC -5357 

MTSSPRMNPFRIVKGCAPGSVRKTVSTSSSSSQ 857 

AATGACGTCATCACCACGMTGAATCCATTCAGMTTGTGAMGGATGCGCACCGGGMGTGTACGAAAAACTGTTTCCACATCATCATCGTCAAGCCAA -5257 

GRPGHYSAHIRSASRNAGMIPDDPTQPSSSSERR 891 

GGACGTCCCGGACATTATTCTGCAAATCTTAG ATCAGCATCTAGAAATGCAGG AATGATACCAGATGATCCAACTCAACCGAGTAGTTCTTCGGAAAGAA -5157 

SHP178 J 

S . 892 

GATCCtagggatcaatatctcttcagtttcatcattttatgctgtaaattgtatttaagtattcctattctttgtagtactgtatttacacatcgtctag -5057 



ttaaaatcacaaatctccgaaaaaacaaaccagtgaacatgtgatatttctcttgcccatagttctcttttttttttgaaacaaaaacaattacttttat -4957 

polyA 

gctcacctattcgagccatatttttttcccaattaccggttgtttattttaatttcttttttttttctgtaaatctactttatttttaaaactgcatttg -4857 



agattgtgtatattttttcaaaatggttcaaatgccgaatctatctactt -4807 
^ MAEKAENLPSSSAEASE 1

tttaatcattattcaaacagaaaaaccgattatttattcagattctcaaaaATGGCTGAAMAGCTGAAAATCTTCCATCTTCTTCGGCCGAAGCTTCAG -470 

EPSPQTGPNVNQKPSILVLGMAGSGKTTFVQ 4 

MGAGCCATCACCTCAMCTGGACCAMTGTGAATCAAMfiCCATCGATTTTGGTTCTTGGJATGGCTGGTTCTGGAMAACGACATTTGTTCAGgtaac -460 



RLTAFLHARKTPPYVINLDP 6 

tttcattcaattttgagagttttcaaacattactattttcagCGTCTCACAGCATTCCTACATGCTCGTAAAACACCTCCATATGTGATTAATCTGGATC -450 

A V S K V P Y P V N V D I R D T V K \ I E V M K E F G H G P H G A 10 

CGGCAGTTAGCAAAGTACCTTRT CCAGTGAATGTTGACATTCGA GATACTGTGAAATACAAGGftAGTTATGAAAGAATTCGGAATGGGACCAAATGGAGC -440 

T SHP179 

IHTCLNLMCTRFDKVIELINKRSSDFSVCLLDT 13 

AAHATGACATGT CTTMCCTGATGTGTACTCCT ^ -130 

SHP180 

P G Q I E A F T W S A S G S I I T D S L A S S H P T 16 

CCTGGACAMTTGMGCATTCACTTGGAGTGCTAGTGGATCTATTATCACTGATTCATT GGCAAGTAGCCATCCCACGgt aagggattttgatttatgaa -420 

* SHP143 



atctgcttgaaatgaaaaaagattctaataaatttttgacttttaaacattttttacagttatatttggtctattttctatcattaaaagcaaaatgaaa -410 

V V H y I V D S A R A T N P T T F H S N 18 
agtcgattctactccatatttattaatttcgactttto^ -400 

! SHP144 

_fs=:i4A 
H L V A C S I L Y R T K L P F I V V F I I A D I V K P T F A L K » M 21
ATGCTCTACGCATGTTCCATTCTCTACCGTACCAAACTTCCATTCATTGTCGTTTTCAACAAAGCTGATATTGTCAAACCAACATTTGCACTCAAATGGA -390 

QDFERFDEALEDARSSYKNDLSRSLSLVLDEFY 24 
TGCMGATTTCGAAAGATTTGATGAAGCTTTAGAGGATGCCAGAAGCAGTTATATGAATGATTTGAGTCGTTCATTGAGTCTCGTTCTTGATGAATTCTA -380 

T 

SHP181 

C G L K T V CVSSATGEGFEDV 26 

TTGCGGACTGAAAACAGgtttttattcgaaataaaaccttttttaaataataaatttcagTTTGCGTCAGTTCTGCAACTGGAGAAGGATTCGAAGATGT -310 



HTAIDESVEAYKKEYVPMYEKVLAEKKLLDEEE 29 
AATGACAGCAATCGATGAAAGTGTTGAAGCATACAAAAAAGAATATGTTCCAATGTATGAAAAAGTGTTGGCTGAGAAAAAACTATTGGATGAGGAGGAG -360 



R K I R D E E TLKGKAVHDLNKV 31 

AGAMGAAAAGAGATGAAGAGgtaattgtagtaatttaattctgattatcttcaaattttcaqACTCTGAAAGGAAAAGCTGnCACGACCTGAACAAAG -350 

ANPDEFLESELNSKIDRlfiLGGVDEENEEDAEL 35 

TCGCCMTCC CGRCGMTTTCTGGAGTCGG AGTTGAATTCAAAAATCGATAGAATTCATTTGGGCGGAGTCGATGAAGAGAATGAGGAGGATGCTGAACT -340 

SHP182 ~* 

E R S • 35 

CGAAAGATCCtgattttctttttgtttttgaatttttattctattttqatccctgtttacttcttattgttctcattttgttqcgttgttttacatttta -330 



ctcatttttgcataaacttgttgcaaaaataiatataatttttgatctggaaatggttttaaaccttaacctttcatatattaataattttttttcaaaa -320 



aaacgttctaaaaaggttcctcattttttcaatataggaaattttgaaqa -315 
SL2

~\ H S E I T F H K 8 

tcttttccaaaaatgaggttcttcgcttgaaaagccaacatttaaaacctttttttttccagaaacctagtggttaATGTCTGAAAAGACGTTCCACAAG -3057 

A Q T I R A K A S G V P S I V E A V Q F H G V R I T I 1 D A L V I E 42 
GCACAGACCATCCGTGCAMGGCATCCGGAGTGCCTTCMTCGTCGAAGCTGTACAGTTTCATGGAGTTCGCATCACAAAAAACGATGCT1TCGTTAAGG -2951 

V S E L y R 48 

AGgtactacccaaatttcaaaatgttgcacaattcaattgaaaatataaattgtgaattaaattcaacttacatgttttttcagGTTTCCGAATTATACA -285 

SKNLDELVHNSHLAARHLQEVGliHDNAVALIDT 81 

GAAGTAAAAATCTAGATGMCTTGTTCATAACTCTCATCTGGCGGCTCGTCATCTTCAAGAAGTT GGArTAATGGATAATGCAGT TGCTCTAATTGATAC -275 

f SHP183 

S P S S N E G ! V V N F L V R E P K S F T A G V K A G V S T H G D 114 

ATCTCCMGCTCAAATGAAGGATftTGTTGTCAATrTCCTAGTTCGRGAACCAAAATCATTCACTGCTGGAGTCAAAGCAGGAGTTTCAACGAATGGAGAT -26 

A D V S L N A G K Q S V G G R G E A I H T Q ! T K T V K 14 

GCGGATGTCAGTTTAAATGCCGGAAAACAMGTGTTGG AGGACGAGGAGAGGCAATCAAT ACACAGTATACATATACTGTAAAGgtaaggacgagagttq -255 

T SHP145 

gcactgccagtttggcatgttctcccaatattttttaattataaaatttggaagtataaaaaaatgtttgcttcatctaaaaatagcctttttcacatga -245 



aaaaaattgaaaaaaagtgctcaaaaatttcagaaatttccaatttccaaacaattttggagaactttcaaaaatttttccaactgaaattaaagctata -235 
jojj-j continued,,, G D « c f in

ttctatcactaaattttatacaagtcttaaqagaaaatgatqaagtggctcattttqtagaatttcctaaaaaataatatcttcagGGCGATCRCTGCTT -225 



H I S A I K P F L G « Q K Y S I V S A T I \ R S L A B M P I N Q S 180 

C AACATTTCCGCAATCAARCC ATTCCTGGGATGGCAAAAATRTTCGAATGTATCftGCGRCTCTATftCCGTTCACTTGCACATATGCCAT GGflATCAATCft -215 
SHP138 * 1 SHP146 

DVDENAAVLAYNGQLWNQKLLHQVKLNA 208 

GATGTTGAT GAGAATGCAGCTGTTCTTGCATATAATGGACAACTATGGAATCAAAAGCTTTTGCATCAAGTCAAATTGAATGCGgtaaagtattataagt -205 

I I R T L R A T R D A A F S V R E Q A G H 1 L 23 

gttttgtccaaactatgatacagttcttcagATATGGAGAACACTTCGTGCCACTCGAGATGCCGCATTTTCAGTTCGTGAACAAGCCGGACACACTTTG -195 

K F S L E N A V A V D T R 0 R P I L A S R G I L A 25 

AAATTCTCGTTGGAGAATGCTGTAGCTGTTGATACAAGAGATAGACCTATTCTTGCAAGTCGTGGAATTCTTGgtaagagtaacaacgactatttttaaa -185 



aaatatctttttcgaaaaaattacgaacgaaaaaaaactgtattatgtacccaaacgcgaaattttgcagttcttgcgcgttcttgttgataaaaaatat -175 

R F A Q 26 

gtaaaaaattggaaaaactacgaaaagtcgataaaaattccgtaccaaccggaaaatgtttcattaatttctcttccttttttcagCTCGTTTTGCTCAA -165 

EYAGVFGDASFVKNTLDLQ 219 

GAGTACG CAGGAGTATTTGGTGATGCGT CATTTGTGAAGAATACATTAGATTTACAGgtaacaaccttatttcaacaattatttcaaattctattaaaaa -155 
SHP133 1 

A A A P L P L G F I L A A S F Q A K H L K G L G D R E V H I L 31 

taattccagGCA GCTGCCCCTCTOACTCGG TTTCATTCTTGCCGCCTCATTCCAAGCGAAACATTTGAAAGGACTCGGAGATCGAGAAGHCATATTT -H5 

SHP140 1 
DRCYIGGQQDVRGFGLNTIG 330
TGGATflGATGTTATTTGGGT GGRCAACRGGATGTTCGAGGATTTGGTCTGRATACTRTTGGfigtgagttttaacgaaattctcttgaaagtcaaataatc -1357 
T SHP184 

V K A D H S C I G G G A S L A G V V H L I R P L I P P N H I F 361 
attttcagGTTAAAGCAGATAACAGTTGTCTTGGAGGAGGTGCTTCACTTGCTGGTGTCGTTCATTTGTATCGGCCATTGATTCCACCAAATATGCTATT -1257 

A H A F L A S G S V A S V H 5 K N L V Q Q L Q D T Q R V S A G F G 394 
TGCACACGCATTCCTTGCATCTGGAA GTGTTGCATCAGTTCATTCC AAAAATTTGGTGCAACAATTACAGGATACTCAACGAGTATCAGCCGGATTTGgt -1157 

SHP163 T 

gagtttgaaatttaggaaacatttggatgaaatgtattttttaaaaatagatcagctttatttatttgaaaaaaaacgctcattaatcaatagtgatagt -1057 

tccattctgagtttcttcttcttcctcqcggaatacaatttttgacttgttcgcatccttcttgtgtactttgtcaccaatcttctcatcaactaaatct -957 

cgaaactgaaaaaatttcaaaattattccaaaaaatattgatgcagactacctttttgatggcttctggtacgtttctagcgtcgaatggattggctcct -857 

ccaataattaaagtctcgttcggtagtttagccagacggacggtgtgcttcaacatttttctaattaatctatttcaattcaagtcactcactctctctt -757 
go/hi continued,,,

gacgtcttcttctatattccaagaactctgcagaaaatccgtgtccgccttqtgtgtttctagttggcgtcggaggattcacgggtccaagacgaatgga -651 

tgtctaaaaaatgttatatttttgcataaagaaaacaccataccttcaccactttttgagttgtgggcgttctgaatggaattgatcgattattattgct -557 

ctttcttgatttgcttctatcagctgcgtaatgaggtgttctaaagatcagctttaattcatttggacaagtgctcctctaataaacttaccctgtactc -451 

atttttgaaacgatttacgatgataagattgaaagtggaagttaaatttagtctttcaaagttgaaataaaatcttcataaataaataaatttaaatgaa -357 

I A F V F K S 401 

agattaaataaattaacgttcacgtagttaaaaaaataatttaaatcttaaacttctaataaaaaatctcaattttccagGACTCGCATTCGTGTTCAAAA -251 

I F R L E L I I T Y P L K Y V L G 0 S L L G G F H I G A G V (J F L 434 
GTATTTTCCGGCTGGAACTCAACTACACGTATCCATTGAAATATGTGCTCGGCGATTCATTGCTCGGTGGATTCCATATTGGAGCTGGTGTCAACTTCTT -151 

Gtagaga ttaattggatgcaagcacccct caaaaagatttttttgaaaaacgataaattcacagaatttcagttctttttctcccccttttattgttatt -51 
SHP134 



ttcatcgtaa tgctgtgctagaagtcagag taaatatgagtttttttgtgttctaggaattccattttttcaggaagcaaatttaataaaaattatcgaa 44 
SHP164 T 

polyA 

r 

tttcttgctctaaagatgttgtacattttat ggaaatgbtcgtatagtaa 94 

* SHP135 
SL2

-\ HSLRKINFVTG 11 
tt cgaacactttatatttctcg ttttaaaactgtcggtqttttatagtaaactatcttcagaaaaaaATGAGCCTACG MAAATCAflTTTCGTAACTGGA 194 

~~^ ] » SHP118 * 



NVKKLEEVKAILKNFE 27 
MCGTGMGMKTTGAAGMGTCMGGCTATTTTGMGMTnCGAGgtaaaatatatttgatattattcgaacgcqaaattttqcgccaaaagtacga 294 



tgcctggtctcaacacgacaatattttgttaaatacaaacgaatgtgcgccttcaaagaaaagtttcaatctttcgttgccgtggagatatttttagagt 394 



VSNVDVDLDEF 38 

ttttgtttaaattatatatttgtcgtatcgaaaccgggtaccgtaatcaatcaattaaatattttcagGTTTCAMCGTGGATGTCGATTTGGATGAAn 494 



SHP165 

0 G E P E F I A E R K C R E A V E A V ft G P V L 62 

CCMGGAGRACCCGAATTTATTGCCGMAGAMGTGCCGTGAGGCTGTTGMGCTGTAAMGGGCCCGTTTTGgtatggaaaattgtatttgttctaaaa 594 



VEDTSLCFHAHGGLPGPyiKSFLKHLKPE 91 
attgtcaaatttcagGTCGAAGACACAAGTTTATGCTTCMCGCAATGGGCGGTCTTCCTGG ACCTTATATCAAGTGGTTTTTG AAGAATTTGAAACCAG 694 

SHP12J 

-Jt£=3--1BA 
kf-\ continued...

. G L H N H L A GFSDKTAYAQCIF 111 

AAGGACTACATftATATGCTAGgtaaatattttaattttttgaaaaaacttatttttcagCCGGRTTTTCTGACAAAACCGCCTATGCTCAATGCATCTTT 194 



AYTEGLGKPIHVFAG 126 
GCGTACACTGAAGGACTCGGAAAACCTATTCATGTATTTGCTGgtatgattttttqaatttaattctttaattttatatgttaatttagttgtttcattc 894 



K C P G Q I V A P R G D T A F G I D P 145 
ctcaatttatqagagatttttttttcaatttttctatttcagGAAAATGTCCTGGTCAMTTGTTGCTCCACGT GGTGATACTGCTTTTGGATGG GATCC 994 

T SHP130 



CFQPDGFKETFGEMDKDVKNEISHRAKALELLK 178 
ATGCTTCCAGCCAGATGGTTTTAMGMCATTCGGAGAAATGGATAAAGATGTAAAMTGAMTTTCTCATO 1094 



1 SHP119 SHP120 ' 



E Y F Q N H • 184 
GAATATTnCAGAATAATtaaattattttttctcatctatgcaatttcttgaaaatttgttaagtttccgttgttatgcatttgcttttatttaaaaaaa 1194 



r 

aaagaatatttttacattaatattagatatgagaaaagagtaatttctggattttaaccttcctacaaaagaatatttatattttttgtatgatttttta 1294 



SHP93 

-h==r - IBB 
(1) GENERAL INFORMATION:
(i) APPLICANT: McGILL UNIVERSITY

(ii) TITLE OF INVENTION: THE C. ELEGANS gro-1 GENE 

(iii) NUMBER OF SEQUENCES: 62 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SWABEY OGILVY RENAULT 

(B) STREET: 1981 McGill College Avenue - Suite 1600 

(C) CITY: Montreal 

(D) STATE: QC 

(E) COUNTRY: Canada 

(F) ZIP: H3A 2Y3 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: Windows 

(D) SOFTWARE: FastSEQ for Windows Version 2.0b 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: CA 2,210,251

(B) FILING DATE: 25-AUG-1997 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Cote, France 

(B) REGISTRATION NUMBER: 4166 

(C) REFERENCE/DOCKET NUMBER: 1770-179PCT FC/ld

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 514 845-7126 

(B) TELEFAX: 514 288-8389 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:




60 
120 
180 
TAGAATAATG AAACATTTTC AGAATTCATT ACCGTCAATG TCAGATAGTC ATTCCTTGAG 240

TATTTTGTGG ATGCTTTGAA AATTCTTCGC T GGG C CAT AT CTGTTGGATA AT CT GAAAAA 300 

CGCAATAAAT TTCATCGAAA ATGCCTATTA AATTGAATTA CCTTCTTCTT CAT CATTT C C 360 

TAACAATTCA TGCTCTTTTT GTGCTTGACT TGTGACCAAT TCTTTAAATT CAATTAAATC 42 0 

GTCAATAT CC TTTTGTACTA AATCCATCTT GATATTCAAT ATATCTTTGT CAGTATAGTA 480 

TTCAGCGTAT CTGAAATTTC GAATTTATTT TTCTAATTCC CAAGAAAAAT AATTAATAAG 54 0 

AATACCTTAA CGAATTATTA TCCAATATAT CATCATTTGC CACATCTGGA AGACGCTGAG 600 

GAACTGTTTG AGCAGCTTGG AGGTAGTCGT CATCGTCTCT GGAAATT GTT ATTTTCAATT 660 

TCAAAAAAAA AACTTTACTT AC GAAAT AT A CTCATTTGAT GCAATCCACG GATCAAAACG 720 

ACGTCTTTGC ATCTTTGAAT CATTTTCCGC ATGGCACCGC ATCACTT CTT T CT TAT GAT T 780 

ATTTTCTAAC GTTTTTGAAA ATTCGACGTG CTCTTCACAA CGGCCGCCAT GTTTCGCAAG 84 0 

TTCTTCTTTT GAT C GT AT CT AAAATTTTAA ATTT GAAAAA AAGCTTACTA TCAAATTTTC 900 

GTATTTTTTC TCACCTGCTT ACACCGAACA AGCGTTCGAT ACGAAGCATA AT T AC AT T GT 960 

CCATACTTAT TTTTGTCGTA TTCATTGGCA ACAAGACGGA ATCGTGTTCC AGGTGCAACT 102 0 

ATATATTGAG CAGGAGGACG AGTTGTTTGT TTCATGCTGC TTAAAAATAA AAATGGAAAA 1080 

TTGAGTCAAA AAGTTGAGAT AAAACAAATT AAAACAATTT TCTGAAAAAT AAACAACTGA 1140 

AATTTGAAGT AATAAACAAC ACGCGAAAAC GTTATTT CGG AGCATCGTTT GAGAAGTAAA 1200 

ACTTTTTTTC GGCGCACCCT TGTGCGCAGT TTTTATCTTC TCTTTTAATT TAATTTT CAA 1260 

GCTAAATCTT TCTTTTTAAA CTTTGAATAA ATATTTAAAT ATT CAGAAT G CACCAATAAA 132 0 

CCTGGAACAA AAT C G AT AAT GTTCCGCAAG CTTGGTTCTT CTGGGTCACT ATGGAAGCCG 1380 

AAAAATCCGC ATTCTTTGGA ATACCTCAAA TATTTACAAG GAGTGCTCAC AAAAAATGAG 144 0 

AAAGTTACGG AAAACAATAA G AAAAT AT T A GTAGAAGCAT TACGAGCTAT CGCAGAAATT 1500 

CT CATTT GGG GCGATCAGAA TGATGCTTCG GTTTTTGAGT GAGTTTTTTT CCAATGTTTT 1560 

TTTTCAAATC TGATGTTGAA TTTCAGTTTC TTCCTTGAGC GGCAAATGCT TCTTTATTTC 1620 

TTGAAAATTA TGGAACAAGG AAACACACCA CTAAAT GT AC AATT ACT G C A GACTTTGAAC 1680 

ATTTTATTCG AAAATATTCG ACAT GAAACT TCACTTTGTA AGTTTTTTAT ATGGATTTTC 1740 

GCTTAAAATT GCCAGTTTTC AGATTTCCTT CTAAGTAACA AT CAT GT AAA CT C GATTATT 1800 

TCCCACAAAT TCGATTTACA AAAT GAT GAG ATCATGGCTT ACTACATTAG TTTTCTGAAA 1860 

ACTCTTTCAT TTAAACT GAA TCCAGCTACA AT C CACTT CT TCTTCAATGA AACGACTGAA 1920 

GAATTTCCAT TGTTGGTAGA AGTTTT GAAG CTTTATAATT GGAATGAATC AATGGTTCGA 1980 

ATTGCTGTTA GAAAT AT T CT TTTAAATATT GTGAGAGTTC AAGATGATTC AAT GATTATT 204 0 

TTCGCTATCA AG CAT AC AAA AGTTAGTAGA AAATTATTTT GAAAAGGTGT ATTTAAGCAA 2100 

TAAATATTAC AG GAATAT CT ATCGGAGTTA ATAGAT T CT C TAGTTGGTCT CTCACTTGAA 2160 

ATGGACACAT TTGTACGATC TGCTGAGAAT GTGTTAGCTA AT C GAGAGAG ATTAC GAGGA 2220 

AAAGTGGATG ATTTAATTGA TT TGATT CAT TATATTGGTG AACTATTGGA TGTGGAAGCT 22 8 0 

GTCGCCGAAA GTTTATCAAT T T TAG GT GAG TTTTACTGCT G G AAAAT CAA GTTTTTAATG 234 0 

TTAAATTTTC AGTAACAACA CGATACTTAA GCCCTCTATT ACTTTCAAGT AT AT CAC CAA 2400 

GAAGAGATAA TCATT CACTT CTACTCACTC CGATTTCTGC GTTATTTTTT TTCTCTGAAT 24 60 

TTTTATTGGT GAGTTTTAAC ATTTAAAATT ACATTTTTCT AAT T T AT TT A TTTTTCAGAT 2520 

AGTTCGTCAC CAT G AAAC AA TATATACATT TTTATCATCT TTCCTATTTG ACACTCAGAA 2580 

T ACT TT GAC G AC C CAT T GGA TACGTCATAA T GAGAAAT AT TGCTTAGAAC C GAT TAG AT T 2 640 

AT CAT CACCA AC C G GAGAAT ATGTGAATGA AGACCAGTAA GAGCT GAAAT TTTAAAATTT 2700 

TTGCTTTGAA TATAGTATTT TCAGCGTATT TTTCGATTTT CTACTGGAAG CATTTGATTC 27 60 

CAGTCAAGCA GACGATTCGA AGGCATTCTA TGGATTAATG CT GAT T TATT CAATGTTTCA 2820 

GAATAATGGT GAGTT TTAAA AAATTGATTT GTTAAATTAA AATTTCCATT TCCAATAACT 2 880 

C CT CTT CAGA CAGTAAGTTT T C AAT GT T GT AAAGTTCCTG TTCATCTGTG ATCGTTTTCT 2 940 

TCATTTTTTT AGTTTTGCAT GAACAGTTTT CAAATTTTTT T GAT AT CAT A CAGTAAATAT 3000 

CGTCATCCAG ATAATTTTCT ATTTAAAAAA AAT G AAT AAA AAGAGGGCGC GCAGAAATTG 3060 

CCGAAGTAAT GT AAAT T T AA AGGGACACAT GCGTAGCTTG TTGTGTGGGT CTCGCCGCGC 312 0 

TTTGTTTGAT TTATCTT GTT TTCTGCTCAA AGAGCTGTTT TTATTTTAGC GTTGAATGCT 3180 

TTTTTACCGT TCTCATCGGC TTTTTAATAG GAATAT TTAA AAAAAAAGGT TTAATAAATC 324 0 

TTCGTTTTTA CAAAATCCAT CT AAGATTT G CAT T T GT G AA GCTCAACAAG TAAAGTTTTA 3300 

AGTAACATTG TTTTTTAAAA AACAATT GAA CCAAATTTTG CCGAAACATT AAT AAC AT G A 3360 

C GAT ACT CTA TAAAATATTC CTCTTTTCAA AATAAATTTT CAAAAAAAAT CCATTTTTCA 342 0 

GCCGATGTTG GAGAACTTCT ATCTGCTGCC AACTTCCCAG TGCTCAAAGA AT CAAC GAC A 3480 

ACTTCATTAG CT CAACAGAA TCTTGCTCGT CTCCGAATAG CATCTACGTC TT C CAT AT CA 3540 

AAGCGAACGA GAGCTATCAC T GAAAT T GGA GTAGAAGCGA CCGAGGAAGA TGAGATTTTT 3600 

CAT GAT GTT C C T GAAGAAC A AACGTT GGTA AGTAAATAAA TCAACATTGA T T GT T AC ACA 3660 

AACTTTAATA TTTTTAAATT TGAAAATTTT CTTCAAAGTG CTCAAAAATC CTGTCGAAAA 372 0 
TTACAGGAAG ATCTGGTGGA TGATGTATTG GTTGATACTG AAAATTCAGC AATAAGTGAT 3780

CCAGAAGTGA G T AGAAAAC G TGCATGTATT AAT TAT T AAA AAAAAAATAT AGTTTTCCCC 384 0 

AGTTTTCCTT G AC CT AAAAC TCAGCAATTT CAGCCTAAAA ACGTGGAGTC AGAATCTCGT 3900 

TCTCGATTTC AATCTGCTGT TGATGAGCTT CCACCTCCGT CGACTTCTGG ATGTGATGGT 3 960 

CGACTTTTTG ATGCACTTTC AT C GAT TAT C AAAGCAGTTG GAACAGAT GA CAATCGAATT 4 020 

CGACCAATTA CAT T G GAACT TGCATGTCTT GTAATTCGGC AAATTTTAAT G ACT GTT GAT 4 080 

GAT G AAAAAG T AAGAT T AC A AATT CAAAAT T GAG C AAAAT CAGAATCTAA AT T T CAT AAA 414 0 

TTGTTCAGGT AC AT AC CAGT TTAACGAAAT TATGCTTCGA AGTTCGTCTA AAACTTTTAT 4200 

CATCAATTGG ACAAT AT GTT AAT G GAGAG A ATCTGTTTTT GGAGTGGTTT GAG GAT GAAT 4260 

AT GCAGAATT T GAAGT AAG C CAAGAGGTCC GAAAAT AAT T T AAT T CAT C C TTTTTATTCA 4 320 

GGTGAATCAC GTGAATTTCG AT AT AAT C G G TCACGAAATG CTTCTTCCTC CAGCTGCAAC 4380 

TCCTCTTTCG AATCTGCTAC TTCATAAGCG ATTGCCCAGT G GAT T T G AAG AAC GAAT AAG 444 0 

AACTGTAGGA AACTTTTTAA ATTT GAAAAT T AAT TAT AT A TATATTTGCA GCAAATCGTA 4 500 

TTCTACCTAC AT AT T C GAAA ATTGGAACGA GATTTGACCG GTGAAGGAGA CACAGAATTA 4560 

CCTGT GAGAG TGTTGAATTC TGATCAGGAA CCAGTTGCCA TCGGTGATTG TATTAATTTA 4 62 0 

CGTGAGTTCA T CT G CAT AG A AAAC AC CAT A TTTCTACTCA AATTAACAAT T T T C AG AT AA 4 680 

TTCGGATCTT CTATCCTGCA CTGTGGTTCC TCAACAACTA TGTTCTCTTG GAAAACCTGG 4 74 0 

TGATCGTCTT GCTCGATTCC TTGTCACTGA TAGACTT CAA TTAATTCTTG TCGAACCGGA 4 800 

TTCTCGAAAA GCCGGATGGG C AATT GTT CG ATTCGTAGGA CTTCTTCAAG ATACAACAAT 4 8 60 

TAAT GGAGAT TCTACGGATT C GAAAGTTT T GCATGTTGTG GTGGAAGGGC AACCCTCGAG 4 920 

AAT T AAG GT A AGAATACTAA CGGGAAAAAA AAATCAAAAA ATTACTTCTG TTTCAGAAAA 4 980 

GACATCCGGT TTTAACTGCA AAGT T CAT AT TCGATGATCA CATTCGGTGT ATGGCAGCAA 504 0 

AG C AAC G G CT C AC C AAG GT A ACGGAAAAAA TAACCAAAAA GACGGAAAGT TATTGTAAAT 5100 

GGACGAAATC GGCGAAATTA ATT GAAAACG TTTGAATTTG CCGCTAAAAC CAAACGAAAA 5160 

CCAAACGAAA GCGAAATTTA ACTATCCCTT CAGGTAGAAT ATACATTTTA TTTCTCTTTA 5220 

TAGGGTCGCC AAACAG C AC G TGGTCTGAAA CTTCAGGCGA TATGTTCAGC TCTTGGAGTT 5280 

C C AC GT AT C G ATCCAGCGAC AATGACGTCA TCACCACGAA T GAAT C CAT T CAGAATTGTG 534 0 

AAAGGATGCG C AC C GG GAAG TGTACGAAAA ACT GTTT CCA CATCATCATC GTCAAGCCAA 5400 

GGACGTCCCG GACATTATTC TGCAAATCTT AGATCAGCAT CTAGAAATGC AGGAATGATA 54 60 

C C AG AT GAT C CAACTCAACC GAGTAGTTCT TCGGAAAGAA GATCCTAGGG AT C AAT AT C T 5520 

CTTCAGTTTC AT CAT T T TAT GCTGTAAATT GT ATT T AAGT ATTCCTATTC TTTGTAGTAC 5580 

TGTATTTACA CAT C GT CTAG TT AAAAT CAC AAATCTCCGA AAAAACAAAC CAGT GAACAT 5 640 

GT GAT AT T T C TCTTGCCCAT AGTTCTCTTT TTTTTTTGAA ACAAAAACAA TTACTTTTAT 57 00 

GCTCACCTAT T C GAG C CAT A TTTTTTTCCC AAT TAC CG GT TGTTTATTTT AATTTCTTTT 5760 

TTTTTTCTGT AAATCTACTT TATT T TT AAA ACT GCAT T T G AGATTGTGTA TATTTTTTCA 5820 

AAATGGTTCA AATGCCGAAT CTAT CTACTT TTTAATCATT AT T CAAAC AG AAAAAC C GAT 58 80 

TATTTATTCA GAT T CT C AAA AATGGCTGAA AAAGCTGAAA AT CTT C CAT C TTCTTCGGCC 5940 

GAAGCTTCAG AAG AG C CAT C ACCTCAAACT GGACCAAATG TGAATCAAAA ACCATCGATT 6000 

TTGGTTCTTG GAAT G G CT GG TTCTGGAAAA AC G AC AT T T G TTCAGGTAAC TT T CAT T CAA 6060 

TTTTGAGAGT T TT CAAAC AT TACTATTTTC AGCGTCTCAC AGCATT CCTA CAT G CT C GT A 6120 

AAACACCTCC AT AT G T GAT T AAT CT G GAT C C G G CAGT TAG CAAAGT AC C T TAT C CAGT GA 6180 

AT GTT GACAT T C GAGAT ACT GTGAAATACA AG GAAGT TAT GAAAGAATTC GGAATGGGAC 6240 

CAAATGGAGC AATT AT GAC A TGTCTTAACC T GAT GT GT AC T C GTTT T GAT AAAGTAATTG 6300 

AGTTGATTAA T AAGAGAT CT TCTGATTTCT CAGTTTGTCT TCTT GAT ACT CCT GGACAAA 6360 
TTGAAGCATT CACTTGGAGT GCTAGTGGAT CTATTATCAC TGATTCATTG GCAAGTAGCC 6420
ATCCCACGGT AAGGGATTTT GATTTATGAA ATCTGCTTGA AATGAAAAAA GATTCTAATA 6480
AATTTTTGAC TTTTAAACAT TTTTTACAGT TATATTTGGT CTATTTTCTA TCATTAAAAG 6540
CAAAATGAAA AGTCGATTCT ACTCCATATT TATTAATTTC GACTTTTCAG GTGGTAATGT 6600
ACATTGTGGA TTCCGCTCGT GCCACAAATC CAACTACATT CATGTCCAAT ATGCTCTACG 6660
CATGTTCCAT TCTCTACCGT ACCAAACTTC CATTCATTGT CGTTTTCAAC AAAGCTGATA 6720
TTGTCAAACC AACATTTGCA CTCAAATGGA TGCAAGATTT CGAAAGATTT GATGAAGCTT 6780
TAGAGGATGC CAGAAGCAGT TATATGAATG ATTTGAGTCG TTCATTGAGT CTCGTTCTTG 6840
ATGAATTCTA TTGCGGACTG AAAACAGGTT TTTATTCGAA ATAAAACCTT TTTTAAATAA 6900
TAAATTTCAG TTTGCGTCAG TTCTGCAACT GGAGAAGGAT TCGAAGATGT AATGACAGCA 6960
ATCGATGAAA GTGTTGAAGC ATACAAAAAA GAATATGTTC CAATGTATGA AAAAGTGTTG 7020
GCTGAGAAAA AACTATTGGA TGAGGAGGAG AGAAAGAAAA GAGATGAAGA GGTAATTGTA 7080
GTAATTTAAT TCTGATTATC TTCAAATTTT CAGACTCTGA AAGGAAAAGC TGTTCACGAC 7140
CTGAACAAAG TCGCCAATCC CGACGAATTT CTGGAGTCGG AGTTGAATTC AAAAATCGAT 7200
AGAATTCATT TGGGCGGAGT CGATGAAGAG AATGAGGAGG ATGCTGAACT CGAAAGATCC 7260
TGATTTTCTT TTTGTTTTTG AATTTTTATT CTATTTTGAT CCCTGTTTAC TTCTTATTGT 7320

TCTCATTTTG TTGCGTTGTT T T AC AT T T T A CTCATTTTTG CATAAACTTG TTGCAAAAAT 7380 

CAATATAATT TTTGATCTGG AAATGGTTTT AAACCTTAAC CTTTCATATA TTAATAATTT 7 440 

TTTTTCAAAA AAACGTTCTA AAAAGGTTCC TCATTTTTTC AATATAGGAA ATTTTGAAGA 7500 

TCTTTTCCAA AAATGAGGTT CTTCGCTTGA AAAGCCAACA TTTAAAACCT TTTTTTTTCC 7560 

AGAAACCTAG TGGTTAATGT CTGAAAAGAC GTTCCACAAG GCACAGACCA TCCGTGCAAA 7620 

GGCATCCGGA GTGCCTTCAA TCGTCGAAGC TGTACAGTTT CATGGAGTTC GCAT CACAAA 768 0 

AAACGATGCT TTGGTTAAGG AGGTACTACC CAAATTTCAA AATGTTGCAC AATTCAATTG 77 40 

AAAATATAAA TTGTGAATTA AATTCAACTT ACATGTTTTT TCAGGTTTCC GAATTATACA 78 00 

GAAGTAAAAA TCTAGATGAA CTTGTTCATA ACTCTCATCT GGCGGCTCGT CATCTTCAAG 7 8 60 

AAGTTGGATT AAT GGATAAT GCAGTTGCTC TAATT GAT AC ATCTCCAAGC TCAAATGAAG 7920 

GAT AT GT T GT CAATTTCCTA GTTCGAGAAC C AAAAT CAT T CACTGCTGGA GTCAAAGCAG 7 980 

GAGTT TCAAC G AAT G GAG AT GCGGATGTCA GTTTAAATGC C G GAAAACAA AGTGTTGGAG 8 04 0 

GAC GAG GAGA GG C AAT C AAT ACACAGTATA CAT AT AC T G T AAAGGTAAGG AC GAGAGTT G 8100 

GCACT GCCAG TTTGGCATGT TCTCCCAATA TTTTTTAATT ATAAAATTTG GAAGTATAAA 8160 

AAAATGTTTG CTTCATCTAA AAATAGCCTT TTTCACATGA AAAAAATTGA AAAAAAGTGC 8220 

TCAAAAATTT CAGAAATTTC CAATTTCCAA ACAATTTTGG AGAACTTTCA AAAATTTTTC 8280 

CAACTGAAAT TAAAGCTATA TTCTATCACT AAATTTTATA CAAGTCTTAA GAGAAAATGA 8340 

TGAAGTGGCT CATTTTGTAG AATTTCCTAA AAAATAATAT CTTCAGGGCG ATCACTGCTT 8400 

CAACATTTCC GCAAT CAAAC CATTCCTGGG ATGGCAAAAA TATTCGAATG TATCAGCGAC 84 60 

T CT AT AC C GT TCACTTGCAC ATATGCCATG G AAT C AAT C A GAT GT T GAT G AGAAT G C AGC 8 52 0 

TGTTCTTGCA TATAAT GGAC AACTATGGAA TCAAAAGCTT TTGCATCAAG TCAAATTGAA 8580 

TGCGGTAAAG TATTATAAGT GTTTTGTCCA AACTATGATA CAGTTCTTCA GAT AT G GAGA 8640 

ACACTT C GTG CCACT C GAGA TGCCGCATTT TCAGTTCGTG AACAAGCCGG AC AC ACT TT G 8700 

AAATTCTCGT TGGAGAATGC TGTAGCTGTT GATACAAGAG ATAGACCTAT TCTTGCAAGT 87 60 

CGTGGAATTC TTGGTAAGAG T AACAAC GAC TATTTTTAAA AAATATCTTT TTCGAAAAAA 8820 

TTACGAACGA AAAAAAACTG TATTAT GT AC CCAAACGCGA AATTTTGCAG TTCTTGCGCG 88 80 

TTCTTGTTGA TAAAAAATAT GTAAAAAATT GGAAAAACTA CGAAAAGTCG ATAAAAATTC 8940 

CGTACCAACC GGAAAATGTT TCATTAATTT CTCTTCCTTT TTTCAGCTCG TTTTGCTCAA 9000 

GAGTACGCAG GAGTATTTGG TGATGCGTCA TTTGTGAAGA ATACATTAGA TTTACAGGTA 9060 

ACAACCTTAT TTCAACAATT ATTTCAAATT C TAT T AAAAA TAATTCCAGG CAGCTGCCCC 9120 

TCTTCCACTC GGTTTCATTC TTGCCGCCTC ATTCCAAGCG AAACATTTGA AAGGACTCGG 9180 

AG AT C G AG AA GTTCATATTT T G GAT AG AT G TTATTTGGGT GGACAACAGG ATGTTCGAGG 9240 

ATTTGGTCTG AATACTATTG GAGT GAGTT T TAACGAAATT CTCTTGAAAG TCAAATAATC 9300 

ATTTTCAGGT TAAAGCAGAT AACAGTTGTC TTGGAGGAGG TGCTTCACTT GCTGGTGTCG 9360 

TTCATTTGTA TCGGCCATTG AT T C C AC C AA ATAT GCTATT TGCACACGCA TTCCTTGCAT 9420 

CTGGAAGTGT TGCATCAGTT CAT T C C AAAA ATTTGGTGCA ACAAT TACAG GATACTCAAC 94 80 

GAGT AT C AG C CGGATTTGGT GAGTTT GAAA TTTAGGAAAC AT TT G GAT G A AATGTATTTT 954 0 

T T AAAAAT AG ATCAGCTTTA TTTATTTGAA AAAAAACGCT CATTAATCAA TAGTGATAGT 9600 

TCCATTCTGA GTTTCTTCTT CTTCCTCGCG GAATACAATT TTTGACTTGT TCGCATCCTT 9660 

CTTGTGTACT TTGTCACCAA TCTTCTCATC AACTAAATCT CGAAACTGAA AAAATTTCAA 9720 

AATTATTCCA AAAAAT ATT G AT GCAGACTA CCTTTTTGAT GGCTTCTGGT ACGTTTCTAG 97 8 0 

CGTCGAATGG ATTGGCTCCT CCAATAATTA AAGTCTCGTT CGGTAGTTTA GCCAGACGGA 984 0 

CGGTGTGCTT CAACATTTTT CTAATT AAT C TATTTCAATT C AAGT C ACT C ACTCTCTCTT 9900 

GACGTCTTCT TCTATATTCC AAGAACTCTG CAGAAAATCC GTGTCCGCCT TGTGTGTTTC 9960 

TAGTTGGCGT CGGAGGATTC ACGGGTCCAA GACGAATGGA TGTCTAAAAA AT GTTATATT 10020 

TTTGCATAAA GAAAACACCA TACCTTCACC ACTTTTTGAG TTGTGGGCGT TCTGAATGGA 10080 

ATTGATCGAT TATTATTGCT CTTTCTTGAT TTGCTTCTAT CAGCTGCGTA AT GAG GT GT T 1014 0 

C T AAAGAT C A GCTTTAATTC ATTTGGACAA GTGCTCCTCT AATAAACTTA CCCTGTACTC 10200 

ATTTTTGAAA CGATTTACGA TGATAAGATT GAAAGTGGAA GTTAAATTTA GTCTTTCAAA 102 60 

GTTGAAATAA AAT CT T CAT A AATAAATAAA TTTAAATGAA AGATTAAATA AATTAACGTT 10320 

CACGTAGTTA AAAAAATAAT TTAAATCTTA ACTT CTAATA AAAAAT CT C A ATTTTCCAGG 1038 0 

ACTCGCATTC GTGTTCAAAA GTATTTTCCG GCTGGAACTC AACTACACGT AT C CAT T G AA 104 4 0 

ATATGTGCTC GGCGATTCAT TGCTCGGTGG ATT C CAT AT T GGAGCTGGTG TCAACTTCTT 10500 

GT AGAGAT T A ATTGGATGCA AGCACCCCTC AAAAAGATTT TTTTGAAAAA CGATAAATTC 10560 

ACAGAATTTC AGTTCTTTTT CTCCCCCTTT TAT T GT TAT T T T CAT C GT AA TGCTGTGCTA 10620 

GAAGT CAGAG T AAAT AT GAG TTTTTTTGTG TT CTAGGAAT TCCATTTTTT CAGGAAGCAA 10680 

AT T T AAT AAA AATT AT CGAA TTTCTTGCTC TAAAG AT GT T GTACATTTTA TGGAAATGTT 10740 

CGTATAGTAA T T C GAAC ACT TTATATTTCT C GT T TT AAAA CTGTCGGTGT TTTATAGTAA 108 00 
ACTATCTTCA GAAAAAAATG AGCCTACGAA AAATCAATTT CGTAACTGGA AACGTGAAGA 10860

AG C T T G AAG A AGTCAAGGCT ATTTTGAAGA ATTTCGAGGT AAAAT AT AT T T GAT AT TAT T 1092 0 

CGAACGCGAA ATTTTGCGCC AAAAGT AC GA TGCCTGGTCT CAACACGACA ATATTTTGTT 10980 

AAATACAAAC GAATGTGCGC CTTCAAAGAA AAGTTTCAAT CTTTCGTTGC CGTGGAGATA 11040 

TTTTTAGAGT TTTTGTTTAA ATTATATATT TGTCGTATCG AAACCGGGTA CCGTAATCAA 11100 

TCAATTAAAT ATTTTCAGGT TTCAAACGTG GAT GT C GAT T TGGATGAATT CCAAGGAGAA 11160 

CCCGAATTTA TTGCCGAAAG AAAGTGCCGT GAGGCTGTTG AAGC T GT AAA AGGGCCCGTT 11220 

TTGGTATGGA AAATTGTATT TGTTCTAAAA ATTGTCAAAT TTCAGGTCGA AGACACAAGT 11280 

TTATGCTTCA ACGCAATGGG CGGTCTTCCT GGACCTTATA TCAAGTGGTT TTTGAAGAAT 11340 

TTGAAACCAG AAGGACTACA TAATATGCTA GGTAAATATT TTAATTTTTT GAAAAAACTT 11400

ATTTTTCAGC CGGATTTTCT GACAAAACCG CCTATGCTCA ATGCATCTTT GCGTACACTG 11460

AAGGACTCGG AAAACCTATT CATGTATTTG CTGGTATGAT TTTTTGAATT TAATTCTTTA 11520

ATTTTATATG TTAATTTAGT TGTTTCATTC CTCAATTTAT GAGAGATTTT TTTTTCAATT 11580

TTTCTATTTC AGGAAAATGT CCTGGTCAAA TTGTTGCTCC ACGTGGTGAT ACTGCTTTTG 11640

GATGGGATCC ATGCTTCCAG CCAGATGGTT TTAAAGAAAC ATTCGGAGAA ATGGATAAAG 11700

ATGTAAAAAA TGAAATTTCT CATCGTGCAA AGGCTCTGGA ACTCCTCAAG GAATATTTTC 11760

AGAATAATTA AATTATTTTT TCTCATCTAT GCAATTTCTT GAAAATTTGT TAAGTTTCCG 11820

TTGTTATGCA TTTGCTTTTA TTTAAAAAAA AAAGAATATT TTTACATTAA TATTAGATAT 11880

GAGAAAAGAG TAATTTCTGG ATTTTAACCT TCCTACAAAA GAATATTTAT ATTTTTTGTA 11940

TGATTTTTTA AAAATATCGT CAGGAAATAA TAACATTTCA GATATACCCT GAACTCTACA 12000

GTTTATGATA TTCAGGAAAT TTCTGAATTT TCTGAAACCT TACAAAATGC GAACGGATCC 12060

GATTATTTTC GTGATTGGGT GCACTGGAAC CGGGAAAAGT GATCTTGGAG TGGCAATTGC 12120

AAAGAAATAT GGAGGAGAGG TGATTAGTGT AGATTCAATG CAATTTTATA AAGGTACATG 12180

GGTTTTGTTT CAATTTTAAA TTAATTAATT TTCGTTTTTC AGGACTTGAC ATTGCCACGA 12240

ATAAGATAAC GGAAGAAGAA TCTGAAGGGA TTCAACATCA TATGATGTCA TTTTTGAATC 12300

CATCTGAATC ATCATCTTAT AATGTACATA GTTTCCGAGA AGTCACGTTG GATCTTATTA 12360

AAGTGCTTAA TTCGCGACTT TTTGAACTTG ATCCTAATTT TCATAATTTT CAGAAAATCC 12420

GCGCCCGTTC AAAAATTCCT GTAATTGTCG GAGGAACCAC TTATTATGCT GAAAGTGTCC 12480

TTTATGAGAA TAATCTGATT GAAACCAACA CTTCAGATGA CGTGGATTCC AAATCGAGAA 12540

CATCATCAGA ATCGTCATCT GAAGACACTG AAGAAGGAAT TAGTAATCAA GAATTATGGG 12600

ATGAATTGAA AAAAATCGAC GAAAAATCAG CACTTCTTCT ACATCCAAAT AATCGTTATC 12660

GAGTACAGAG AGCATTGCAA ATTTTCAGAG AAACTGGTAA TTGATTTGCA AATTTCCAGA 12720

TTAAAAACAA ATCAAGTAAA GTTTTTTGCA GGAATCCGAA AAAGTGAACT TGTTGAAAAA 12780

CAGAAATCAG ATGAAACTGT TGATTTGGGT GGACGACTAC GATTTGATAA TTCTTTAGTT 12840

ATTTTTATGG ATGCAACACC TGAAGTTTTA GAAGAAAGAC TTGATGGAAG AGTTGATAAA 12900

ATGATTAAAT TGGGTTTGAA GAATGAATTG ATCGAGTTTT ATAACGAGGT AAATATTTGA 12960

ATTTTTCCAG AAAAAAAAAG AAAATTTTTT ATTATTTTGT TTTTTTTTCA TTCTTTACTA 13020

TTTTCCAAAA AAGTTTAAAC TTTTGAAAAC TGTTCAGAAA ATGTTCGTGT ATTTATTTTA 13080

GCTTACTGAG GCATTATTTC ATTGTGATTT TTACTATACT CTATAAACTA AATTTTCAGC 13140

ACGCCGAGTA CATAAATCAC AGCAAATATG GTGTCATGCA ATGTATTGGT CTTAAAGAAT 13200

TCGTTCCATG GCTCAATTTG GACCCATCAG AAAGAGATAC ACTCAATGGG GATAAATTGT 13260

TCAAGCAAGG GTAATTTAAA TTTATTTTCA ATTTTTATAA ATTCCAAGCT ATTTTCAGAT 13320

GCGATGATGT GAAGCTTCAC ACTCGACAAT ATGCACGGCG CCAGAGACGG TGGTATCGAT 13380

CGAGACTTTT AAAACGGTCG GATGGTGATC GGGTATGTTG ATTTTAAAAA AATTGAATTT 13440

TTAAAGAACT TTTTTACTAA ATTAACAAAG TTATTGGCTG AAAATGGCTG AAAATTATAG 13500

TAAAACTAAT CAAAAAAATT GAAATTTTGA ATTAAAGTCA TAAAGTGACG ACCAGAAAAT 13560

TAAAAAAAAA CATTTTTCTA TTTTAATTAA TTCACTCTAC TTCACTTTAA AAATAATTTT 13620

CAGAAAATGG CAAGTACAAA AATGCTGGAT ACATCTGACA AGTACCGAAT AATTAGTGAT 13680

GGAATGGACA TTGTTGATCA ATGGATGAAT GGAATCGATC TATTTGAAGA TGTAAAATTT 13740

CACAAATTCT AAAATTTCCG AATCACAAAT TAAAATTTCT ACAGATCTCC ACAGACACCA 13800

ATCCAATTCT AAAAGGGTCC GATGCAAATA TTCTGCTGAA TTGTGAAATC TGTAATATTT 13860

CAATGACTGG AAAAGATAAT TGGTTTGTTT CAATACATAT TATAATTTCG AAATGAATTT 13920

TTTCAGGCAG AAACATATCG ATGGGAAAAA GCACAAGCAT CATGCTAAGC AAAAGAAATT 13980

GGCAGAGACT CGCACATAAG ACGCTATATT TATTTTTTGT TAACTTAAAT TATTTTTGTT 14040

GTTGATTGTT CTCTAAATAA AAAAACAGCT CAGAGAGAAG ATTAGGCGCT CGTCCACATC 14100

TCCGACGATA GTCAACCCGA ACGAAGGGAA CTATCTTTAA TTGTCAGTGA TGACGTCATG 14160

TCGTCAAGAA CTCGTCATAG CTGTGAGAAT TGAACCATTA TAGATTTGGA CATTAGTTTA 14220

GGTTATATCC AGTACACTAA ATGGTACATG ATAGACAGTG TACATTTACA GATTTATAGA 14280

TTGTCTCAGT GACTAGTTAC CGGAAGAGGA GAGGAGAACA TGTGGCGATG TCTTTTGGAT 14340

CGATATTATT CCGTCTGAAA ATTGTTCACT AGGGGGACTG CCGATTACCA CTTCACATGA 14400

CGGAACATGT TAGTTAAAAT ATTGGCTTTT ATACACATTT TCAAAATAGC ACCTGTAT 14458



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 430 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ile Phe Arg Lys Phe Leu Asn Phe Leu Lys Pro Tyr Lys Met Arg
1 5 10 15

Thr Asp Pro Ile Ile Phe Val Ile Gly Cys Thr Gly Thr Gly Lys Ser
20 25 30

Asp Leu Gly Val Ala Ile Ala Lys Lys Tyr Gly Gly Glu Val Ile Ser
35 40 45

Val Asp Ser Met Gln Phe Tyr Lys Gly Leu Asp Ile Ala Thr Asn Lys
50 55 60

Ile Thr Glu Glu Glu Ser Glu Gly Ile Gln His His Met Met Ser Phe
65 70 75 80

Leu Asn Pro Ser Glu Ser Ser Ser Tyr Asn Val His Ser Phe Arg Glu
85 90 95

Val Thr Leu Asp Leu Ile Lys Lys Ile Arg Ala Arg Ser Lys Ile Pro
100 105 110

Val Ile Val Gly Gly Thr Thr Tyr Tyr Ala Glu Ser Val Leu Tyr Glu
115 120 125

Asn Asn Leu Ile Glu Thr Asn Thr Ser Asp Asp Val Asp Ser Lys Ser
130 135 140

Arg Thr Ser Ser Glu Ser Ser Ser Glu Asp Thr Glu Glu Gly Ile Ser
145 150 155 160

Asn Gln Glu Leu Trp Asp Glu Leu Lys Lys Ile Asp Glu Lys Ser Ala
165 170 175

Leu Leu Leu His Pro Asn Asn Arg Tyr Arg Val Gln Arg Ala Leu Gln
180 185 190

Ile Phe Arg Glu Thr Gly Ile Arg Lys Ser Glu Leu Val Glu Lys Gln
195 200 205

Lys Ser Asp Glu Thr Val Asp Leu Gly Gly Arg Leu Arg Phe Asp Asn
210 215 220

Ser Leu Val Ile Phe Met Asp Ala Thr Pro Glu Val Leu Glu Glu Arg
225 230 235 240

Leu Asp Gly Arg Val Asp Lys Met Ile Lys Leu Gly Leu Lys Asn Glu
245 250 255

Leu Ile Glu Phe Tyr Asn Glu His Ala Glu Tyr Ile Asn His Ser Lys
260 265 270

Tyr Gly Val Met Gln Cys Ile Gly Leu Lys Glu Phe Val Pro Trp Leu
275 280 285

Asn Leu Asp Pro Ser Glu Arg Asp Thr Leu Asn Gly Asp Lys Leu Phe
290 295 300

Lys Gln Gly Cys Asp Asp Val Lys Leu His Thr Arg Gln Tyr Ala Arg
305 310 315 320

Arg Gln Arg Arg Trp Tyr Arg Ser Arg Leu Leu Lys Arg Ser Asp Gly
325 330 335

Asp Arg Lys Met Ala Ser Thr Lys Met Leu Asp Thr Ser Asp Lys Tyr
340 345 350

Arg Ile Ile Ser Asp Gly Met Asp Ile Val Asp Gln Trp Met Asn Gly
355 360 365

Ile Asp Leu Phe Glu Asp Ile Ser Thr Asp Thr Asn Pro Ile Leu Lys
370 375 380

Gly Ser Asp Ala Asn Ile Leu Leu Asn Cys Glu Ile Cys Asn Ile Ser
385 390 395 400

Met Thr Gly Lys Asp Asn Trp Gln Lys His Ile Asp Gly Lys Lys His
405 410 415

Lys His His Ala Lys Gln Lys Lys Leu Ala Glu Thr Arg Thr



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2041 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTGCCATAAG ATGGCGTCCG TGGCGGCTGC ACGAGCAGTT CCTGTGGGCA GTGGGCTCAG 60

GGGCCTGCAA CGGACCCTAC CTCTTGTAGT GATTCTCGGG GCCACGGGCA CCGGCAAATC 120

CACGCTGGCG TTGCAGCTAG GCCAGCGGCT CGGCGGTGAG ATCGTCAGCG CTGACTCCAT 180

GCAGGTCTAT GAAGGCCTAG ACATCATCAC CAACAAGGTT TCTGCCCAAG AGCAGAGAAT 240

CTGCCGGCAC CACATGATCA GCTTTGTGGA TCCTCTTGTG ACCAATTACA CAGTGGTGGA 300

CTTCAGAAAT AGAGCAACTG CTCTGATTGA AGATATATTT GCCCGAGACA AAATTCCTAT 360

TGTTGTGGGA GGAACCAATT ATTACATTGA ATCTCTGCTC TGGAAAGTTC TTGTCAATAC 420

CAAGCCCCAG GAGATGGGCA CTGAGAAAGT GATTGACCGA AAAGTGGAGC TTGAAAAGGA 480

GGATGGTCTT GTACTTCACA AACGCCTAAG CCAGGTGGAC CCAGAAATGG CTGCCAAGCT 540

GCATCCACAT GACAAACGCA AAGTGGCCAG GAGCTTGCAA GTTTTTGAAG AAACAGGAAT 600

CTCTCATAGT GAATTTCTCC ATCGTCAACA TACGGAAGAA GGTGGTGGTC CCCTTGGAGG 660

TCCTCTGAAG TTCTCTAACC CTTGCATCCT TTGGCTTCAT GCTGACCAGG CAGTTCTAGA 720

TGAGCGCTTG GATAAGAGGG TGGATGACAT GCTTGCTGCT GGGCTCTTGG AGGAACTAAG 780

AGATTTTCAC AGACGCTATA ATCAGAAGAA TGTTTCGGAA AATAGCCAGG ACTATCAACA 840

TGGTATCTTC CAATCAATTG GCTTCAAGGA ATTTCACGAG TACCTGATCA CTGAGGGAAA 900

ATGCACACTG GAGACTAGTA ACCAGCTTCT AAAGAAAGGA CCTGGTCCCA TTGTCCCCCC 960

TGTCTATGGC TTAGAGGTAT CTGATGTCTC GAAGTGGGAG GAGTCTGTTC TTGAACCTGC 1020

TCTTGAAATC GTGCAAAGTT TCATCCAGGG CCACAAGCCT ACAGCCACTC CAATAAAGAT 1080

GCCATACAAT GAAGCTGAGA ACAAGAGAAG TTATCACCTG TGTGACCTCT GTGATCGAAT 1140

CATCATTGGG GATCGCGAAT GGGCAGCGCA CATAAAATCC AAATCCCACT TGAACCAACT 1200

GAAGAAAAGA AGAAGATTGG ACTCAGATGC TGTCAACACC ATAGAAAGTC AGAGTGTTTC 1260

CCCAGACTAT AACAAAGAAC CTAAAGGGAA GGGATCCCCA GGGCAGAATG ATCAAGAGCT 1320

GAAATGCAGC GTTTAAGAGA CATGTCCAGT GGCCTTTGGA AAGGTGGTGG GGATCCAGTT 1380

CAGGAGGGAG GGGTATGTTT GTCTCCCAGT CTGGGCAAAG GAGTGCTATG CGGAATTCTC 1440

TGCATAGCAG AAAAGCTCCC ACCATTTTCT TTTGATGTGG TTTTAAAGTC TCACGTTCTC 1500

TATAATAGAA ACAGCAGGTC TTGTCAGCTC CTTGTGTGGC TGATGTGTCT GGAAATGATG 1560

TAGTTCAGGA AAGCATTTTT TTTTTCTTTG AACCTTAAAG GTTCTATTAT TAAAAGCAGC 1620

ACAGATTCCA CATTTTTATA CATGAGGATC TTCTTTGTGG TGAATACCAG GATTGACTGC 1680

ATCCCTTTAA AAGAAGTTTT ATGTCCCTGA CTCTGGCTAA AATTATCTAA TTTCCAGATG 1740

CTTTTGTAGA TGACTGAAGT ATTTGTGAGC CACATATTGG GAGTTCTAGA TTTGAGTGAA 1800

TGGCAGGAAA GGGCCATCTC CATTGAGATG ATTAAGTGAA CCAAACTAGT TCTCGGAATT 1860

CTACAGAGAA GGAGGGAATC AGACTGAGGA AGCTGTGACA TAGGACTTGA AGACCAAAGA 1920

CTTTGAAATT TGCGAGCTGC TCATGTGTGA GTTATTATCA CTGCTGTCTT TCTATTGAGT 1980

TACAAATCTA TATTTTTATT GAAGTTTAAA TAAAGAAAAA ATTTACAAGA AAAAAAAAAA 2040

A 2041



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 892 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Phe Arg Lys Leu Gly Ser Ser Gly Ser Leu Trp Lys Pro Lys Asn
1 5 10 15

Pro His Ser Leu Glu Tyr Leu Lys Tyr Leu Gln Gly Val Leu Thr Lys
20 25 30

Asn Glu Lys Val Thr Glu Asn Asn Lys Lys Ile Leu Val Glu Ala Leu
35 40 45

Arg Ala Ile Ala Glu Ile Leu Ile Trp Gly Asp Gln Asn Asp Ala Ser
50 55 60

Val Phe Asp Phe Phe Leu Glu Arg Gln Met Leu Leu Tyr Phe Leu Lys
65 70 75 80

Ile Met Glu Gln Gly Asn Thr Pro Leu Asn Val Gln Leu Leu Gln Thr
85 90 95

Leu Asn Ile Leu Phe Glu Asn Ile Arg His Glu Thr Ser Leu Tyr Phe
100 105 110

Leu Leu Ser Asn Asn His Val Asn Ser Ile Ile Ser His Lys Phe Asp
115 120 125

Leu Gln Asn Asp Glu Ile Met Ala Tyr Tyr Ile Ser Phe Leu Lys Thr
130 135 140

Leu Ser Phe Lys Leu Asn Pro Ala Thr Ile His Phe Phe Phe Asn Glu
145 150 155 160

Thr Thr Glu Glu Phe Pro Leu Leu Val Glu Val Leu Lys Leu Tyr Asn
165 170 175

Trp Asn Glu Ser Met Val Arg Ile Ala Val Arg Asn Ile Leu Leu Asn
180 185 190

Ile Val Arg Val Gln Asp Asp Ser Met Ile Ile Phe Ala Ile Lys His
195 200 205

Thr Lys Glu Tyr Leu Ser Glu Leu Ile Asp Ser Leu Val Gly Leu Ser
210 215 220

Leu Glu Met Asp Thr Phe Val Arg Ser Ala Glu Asn Val Leu Ala Asn
225 230 235 240

Arg Glu Arg Leu Arg Gly Lys Val Asp Asp Leu Ile Asp Leu Ile His
245 250 255

Tyr Ile Gly Glu Leu Leu Asp Val Glu Ala Val Ala Glu Ser Leu Ser
260 265 270

Ile Leu Val Thr Thr Arg Tyr Leu Ser Pro Leu Leu Leu Ser Ser Ile
275 280 285

Ser Pro Arg Arg Asp Asn His Ser Leu Leu Leu Thr Pro Ile Ser Ala
290 295 300

Leu Phe Phe Phe Ser Glu Phe Leu Leu Ile Val Arg His His Glu Thr
305 310 315 320

Ile Tyr Thr Phe Leu Ser Ser Phe Leu Phe Asp Thr Gln Asn Thr Leu
325 330 335

Thr Thr His Trp Ile Arg His Asn Glu Lys Tyr Cys Leu Glu Pro Ile
340 345 350

Thr Leu Ser Ser Pro Thr Gly Glu Tyr Val Asn Glu Asp His Val Phe
355 360 365

Phe Asp Phe Leu Leu Glu Ala Phe Asp Ser Ser Gln Ala Asp Asp Ser
370 375 380

Lys Ala Phe Tyr Gly Leu Met Leu Ile Tyr Ser Met Phe Gln Asn Asn
385 390 395 400

Ala Asp Val Gly Glu Leu Leu Ser Ala Ala Asn Phe Pro Val Leu Lys
405 410 415

Glu Ser Thr Thr Thr Ser Leu Ala Gln Gln Asn Leu Ala Arg Leu Arg
420 425 430

Ile Ala Ser Thr Ser Ser Ile Ser Lys Arg Thr Arg Ala Ile Thr Glu
435 440 445

Ile Gly Val Glu Ala Thr Glu Glu Asp Glu Ile Phe His Asp Val Pro
450 455 460

Glu Glu Gln Thr Leu Glu Asp Leu Val Asp Asp Val Leu Val Asp Thr
465 470 475 480

Glu Asn Ser Ala Ile Ser Asp Pro Glu Pro Lys Asn Val Glu Ser Glu
485 490 495

Ser Arg Ser Arg Phe Gln Ser Ala Val Asp Glu Leu Pro Pro Pro Ser
500 505 510

Thr Ser Gly Cys Asp Gly Arg Leu Phe Asp Ala Leu Ser Ser Ile Ile
515 520 525

Lys Ala Val Gly Thr Asp Asp Asn Arg Ile Arg Pro Ile Thr Leu Glu
530 535 540

Leu Ala Cys Leu Val Ile Arg Gln Ile Leu Met Thr Val Asp Asp Glu
545 550 555 560

Lys Val His Thr Ser Leu Thr Lys Leu Cys Phe Glu Val Arg Leu Lys
565 570 575

Leu Leu Ser Ser Ile Gly Gln Tyr Val Asn Gly Glu Asn Leu Phe Leu
580 585 590

Glu Trp Phe Glu Asp Glu Tyr Ala Glu Phe Glu Val Asn His Val Asn
595 600 605

Phe Asp Ile Ile Gly His Glu Met Leu Leu Pro Pro Ala Ala Thr Pro
610 615 620

Leu Ser Asn Leu Leu Leu His Lys Arg Leu Pro Ser Gly Phe Glu Glu
625 630 635 640

Arg Ile Arg Thr Gln Ile Val Phe Tyr Leu His Ile Arg Lys Leu Glu
645 650 655

Arg Asp Leu Thr Gly Glu Gly Asp Thr Glu Leu Pro Val Arg Val Leu
660 665 670

Asn Ser Asp Gln Glu Pro Val Ala Ile Gly Asp Cys Ile Asn Leu His
675 680 685

Asn Ser Asp Leu Leu Ser Cys Thr Val Val Pro Gln Gln Leu Cys Ser
690 695 700

Leu Gly Lys Pro Gly Asp Arg Leu Ala Arg Phe Leu Val Thr Asp Arg
705 710 715 720

Leu Gln Leu Ile Leu Val Glu Pro Asp Ser Arg Lys Ala Gly Trp Ala
725 730 735

Ile Val Arg Phe Val Gly Leu Leu Gln Asp Thr Thr Ile Asn Gly Asp
740 745 750

Ser Thr Asp Ser Lys Val Leu His Val Val Val Glu Gly Gln Pro Ser
755 760 765

Arg Ile Lys Lys Arg His Pro Val Leu Thr Ala Lys Phe Ile Phe Asp
770 775 780

Asp His Ile Arg Cys Met Ala Ala Lys Gln Arg Leu Thr Lys Gly Arg
785 790 795 800

Gln Thr Ala Arg Gly Leu Lys Leu Gln Ala Ile Cys Ser Ala Leu Gly
805 810 815

Val Pro Arg Ile Asp Pro Ala Thr Met Thr Ser Ser Pro Arg Met Asn
820 825 830

Pro Phe Arg Ile Val Lys Gly Cys Ala Pro Gly Ser Val Arg Lys Thr
835 840 845

Val Ser Thr Ser Ser Ser Ser Ser Gln Gly Arg Pro Gly His Tyr Ser
850 855 860

Ala Asn Leu Arg Ser Ala Ser Arg Asn Ala Gly Met Ile Pro Asp Asp
865 870 875 880

Pro Thr Gln Pro Ser Ser Ser Ser Glu Arg Arg Ser
885 890










(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Ala Glu Lys Ala Glu Asn Leu Pro Ser Ser Ser Ala Glu Ala Ser
1 5 10 15

Glu Glu Pro Ser Pro Gln Thr Gly Pro Asn Val Asn Gln Lys Pro Ser
20 25 30

Ile Leu Val Leu Gly Met Ala Gly Ser Gly Lys Thr Thr Phe Val Gln
35 40 45

Arg Leu Thr Ala Phe Leu His Ala Arg Lys Thr Pro Pro Tyr Val Ile
50 55 60

Asn Leu Asp Pro Ala Val Ser Lys Val Pro Tyr Pro Val Asn Val Asp
65 70 75 80

Ile Arg Asp Thr Val Lys Tyr Lys Glu Val Met Lys Glu Phe Gly Met
85 90 95

Gly Pro Asn Gly Ala Ile Met Thr Cys Leu Asn Leu Met Cys Thr Arg
100 105 110

Phe Asp Lys Val Ile Glu Leu Ile Asn Lys Arg Ser Ser Asp Phe Ser
115 120 125

Val Cys Leu Leu Asp Thr Pro Gly Gln Ile Glu Ala Phe Thr Trp Ser
130 135 140

Ala Ser Gly Ser Ile Ile Thr Asp Ser Leu Ala Ser Ser His Pro Thr
145 150 155 160

Val Val Met Tyr Ile Val Asp Ser Ala Arg Ala Thr Asn Pro Thr Thr
165 170 175

Phe Met Ser Asn Met Leu Tyr Ala Cys Ser Ile Leu Tyr Arg Thr Lys
180 185 190

Leu Pro Phe Ile Val Val Phe Asn Lys Ala Asp Ile Val Lys Pro Thr
195 200 205

Phe Ala Leu Lys Trp Met Gln Asp Phe Glu Arg Phe Asp Glu Ala Leu
210 215 220

Glu Asp Ala Arg Ser Ser Tyr Met Asn Asp Leu Ser Arg Ser Leu Ser
225 230 235 240

Leu Val Leu Asp Glu Phe Tyr Cys Gly Leu Lys Thr Val Cys Val Ser
245 250 255

Ser Ala Thr Gly Glu Gly Phe Glu Asp Val Met Thr Ala Ile Asp Glu
260 265 270

Ser Val Glu Ala Tyr Lys Lys Glu Tyr Val Pro Met Tyr Glu Lys Val
275 280 285

Leu Ala Glu Lys Lys Leu Leu Asp Glu Glu Glu Arg Lys Lys Arg Asp
290 295 300

Glu Glu Thr Leu Lys Gly Lys Ala Val His Asp Leu Asn Lys Val Ala
305 310 315 320

Asn Pro Asp Glu Phe Leu Glu Ser Glu Leu Asn Ser Lys Ile Asp Arg
325 330 335

Ile His Leu Gly Gly Val Asp Glu Glu Asn Glu Glu Asp Ala Glu Leu
340 345 350

Glu Arg Ser
355



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 434 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



Met Ser Glu Lys Thr Phe His Lys Ala Gln Thr Ile Arg Ala Lys Ala
1 5 10 15

Ser Gly Val Pro Ser Ile Val Glu Ala Val Gln Phe His Gly Val Arg
20 25 30

Ile Thr Lys Asn Asp Ala Leu Val Lys Glu Val Ser Glu Leu Tyr Arg
35 40 45

Ser Lys Asn Leu Asp Glu Leu Val His Asn Ser His Leu Ala Ala Arg
50 55 60

His Leu Gln Glu Val Gly Leu Met Asp Asn Ala Val Ala Leu Ile Asp
65 70 75 80

Thr Ser Pro Ser Ser Asn Glu Gly Tyr Val Val Asn Phe Leu Val Arg
85 90 95

Glu Pro Lys Ser Phe Thr Ala Gly Val Lys Ala Gly Val Ser Thr Asn
100 105 110

Gly Asp Ala Asp Val Ser Leu Asn Ala Gly Lys Gln Ser Val Gly Gly
115 120 125

Arg Gly Glu Ala Ile Asn Thr Gln Tyr Thr Tyr Thr Val Lys Gly Asp
130 135 140

His Cys Phe Asn Ile Ser Ala Ile Lys Pro Phe Leu Gly Trp Gln Lys
145 150 155 160

Tyr Ser Asn Val Ser Ala Thr Leu Tyr Arg Ser Leu Ala His Met Pro
165 170 175

Trp Asn Gln Ser Asp Val Asp Glu Asn Ala Ala Val Leu Ala Tyr Asn
180 185 190

Gly Gln Leu Trp Asn Gln Lys Leu Leu His Gln Val Lys Leu Asn Ala
195 200 205

Ile Trp Arg Thr Leu Arg Ala Thr Arg Asp Ala Ala Phe Ser Val Arg
210 215 220

Glu Gln Ala Gly His Thr Leu Lys Phe Ser Leu Glu Asn Ala Val Ala
225 230 235 240

Val Asp Thr Arg Asp Arg Pro Ile Leu Ala Ser Ara Gly Ile Leu Ala
245 250 255

Arg Phe Ala Gln Glu Tyr Ala Gly Val Phe Gly Asp Ala Ser Phe Val
260 265 270

Lys Asn Thr Leu Asp Leu Gln Ala Ala Ala Pro Leu Pro Leu Gly Phe
275 280 285

Ile Leu Ala Ala Ser Phe Gln Ala Lys His Leu Lys Gly Leu Gly Asp
290 295 300

Arg Glu Val His Ile Leu Asp Arg Cys Tyr Leu Gly Gly Gln Gln Asp
305 310 315 320

Val Arg Gly Phe Gly Leu Asn Thr Ile Gly Val Lys Ala Asp Asn Ser
325 330 335

Cys Leu Gly Gly Gly Ala Ser Leu Ala Gly Val Val His Leu Tyr Arg
340 345 350

Pro Leu Ile Pro Pro Asn Met Leu Phe Ala His Ala Phe Leu Ala Ser
355 360 365

Gly Ser Val Ala Ser Val His Ser Lys Asn Leu Val Gln Gln Leu Gln
370 375 380

Asp Thr Gln Arg Val Ser Ala Gly Phe Gly Leu Ala Phe Val Phe Lys
385 390 395 400

Ser Ile Phe Arg Leu Glu Leu Asn Tyr Thr Tyr Pro Leu Lys Tyr Val
405 410 415

Leu Gly Asp Ser Leu Leu Gly Gly Phe His Ile Gly Ala Gly Val Asn
420 425 430

Phe Leu







(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 198 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 





(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Leu Tyr Ile Leu Trp Lys Leu Asn Tyr Leu Gln Lys Lys Met Ser
1 5 10 15

Leu Arg Lys Ile Asn Phe Val Thr Gly Asn Val Lys Lys Leu Glu Glu
20 25 30

Val Lys Ala Ile Leu Lys Asn Phe Glu Val Ser Asn Val Asp Val Asp
35 40 45

Leu Asp Glu Phe Gln Gly Glu Pro Glu Phe Ile Ala Glu Arg Lys Cys
50 55 60

Thr Arg Glu Ala Val Glu Ala Val Lys Gly Pro Val Leu Val Glu Asp
65 70 75 80

Ser Leu Cys Phe Asn Ala Met Gly Gly Leu Pro Gly Pro Tyr Ile Lys
85 90 95

Ala Trp Phe Leu Lys Asn Leu Lys Pro Glu Gly Leu His Asn Met Leu
100 105 110

Gly Phe Ser Asp Lys Thr Ala Tyr Ala Gln Cys Ile Phe Ala Tyr Thr
115 120 125

Gly Glu Gly Leu Gly Lys Pro Ile His Val Phe Ala Gly Lys Cys Pro
130 135 140

Gln Ile Val Ala Pro Arg Gly Asp Thr Ala Phe Gly Trp Asp Pro Cys
145 150 155 160

Phe Gln Pro Asp Gly Phe Lys Glu Thr Phe Gly Glu Met Asp Lys Asp
165 170 175

Val Lys Asn Glu Ile Ser His Arg Ala Lys Ala Leu Glu Leu Leu Lys
180 185 190

Glu Tyr Phe Gln Asn Asn
195

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CGAACACTTT ATATTTCTCG 20

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GATAGTTCCC TTCGTTCGGG 20 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TTTCTGGATT TTAACCTTCC 20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TTTCCGAGAA GTCACGTTGG 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
TACAGGAATT TTTGAACGGG 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CTTCAGATGA CGTGGATTCC 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
GGAATCCGAA AAAGTGAACT



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
AAGAGATACA CTCAATGGGG 20

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 
ATCGATACCA CCGTCTCTGG



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TTGAATCTAC ACTAATCACC 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CCAATTATCT TTTCCAGTCA 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 



ACATTATAAA GTTACTGTCC

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 
TTTTAGTTAA AGCATTGACC 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 
ACATCTTTAT CCATTTCTCC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22
TGCAAAGGCT CTGGAACTCC 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 
AAAAACCACT TGATATAAGG

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 
CATCCAAAAG CAGTATCACC



(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25 
TTAATTGGAT GCAAGCACCC C 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26 
ATTACTATAC GAACATTTCC 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27 
TTGTAAAGGC GTTAGTTTGG

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28
CAGGAGTATT TGGTGATGCG 



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29 
CGACGGGGAG AAGGTGACGG 



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30 
AAAACTTCTA CCAACAATGG 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31 
CGTAATCTCT CTCGATTAGC

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
CCGTGGGATG GCTACTTGCC 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
TGGATTTGTG GCACGAGCGG 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
TTGATTGCCT CTCCTCGTCC 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
ATCAACATCT GATTGATTCC

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
CAGCGAGCGC ATGCAACTAT ATATTGAGCA GG



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 
AATAAATATT TAAATATTCA GATATACCCT GAACTCTACA G



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
AAACTGTAGA GTTCAGGGTA TATCTGAATA TTTAAATATT TATTC 



(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GTACGTGGAG CTCTGCAACT ATATATTGAG CAGG
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40
ATGACACTGC AGGATAGTTC CCTTCGTTCG GG



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41 
GTGTTGCATC AGTTCATTCC 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 
GCTGTGCTAG AAGTCAGAGG



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 
GTTCTCCTTG GAATTCATCC 
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44 
AGTATATCTA GATGTGCGAG TCTCTGCCAA TT 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45 
AGTAATTGTA CATTTAGTGG 



(2) INFORMATION FOR SEQ ID NO:46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46 
ATTAACCTTA CTTACTTACC 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47 
CTAAACTAAG TAATATAACC 
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48 
GTTGATTCTT TGAGCACTGG



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49
AATTCGACCA ATTACATTGG



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50 
AACATAGTTG TTGAGGAAGG 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51 
AATTAATGGA GATTCTACGG 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52 
TCAGCATCTA GAAATGCAGG 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53 
CGAATGTCAA CATTCACTGG 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54 
CTTAACCTGA TGTGTACTCG 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55 
ATGAAGCTTT AGAGGATGCC
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 
CGACGAATTT CTGGAGTCGG 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57
ACTGCATTAT CCATTAATCC



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 
CACCCAAATA ACATCTATCC 



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 
TTTAACCTCA TCTTCGCTGG 
(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60 
ATGTTCCGCA AGCTTGGTTC 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 
TTTAATTACC CAAGTTTGAG 



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62
TTTTAACCCA GTTACTCAAG
Although claims 18-27 are directed to a method of treatment of the
human/animal body, the search has been carried out and based on the 
alleged effects of the compound/composition. 



The claims 18-27, referring to compounds interfering with the enzymatic 
activity of the claimed proteins, could not be searched completely due to 
the lack of support of these compounds in the application. 



